Closed · sagarvijaygupta closed this 6 years ago
Merging #252 into master will decrease coverage by 0.64%. The diff coverage is 0%.
```
@@            Coverage Diff             @@
##           master     #252      +/-   ##
==========================================
- Coverage   16.48%   15.83%   -0.65%
==========================================
  Files          12       13       +1
  Lines        1820     1894      +74
  Branches      311      328      +17
==========================================
  Hits          300      300
- Misses       1518     1592      +74
  Partials        2        2
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| migrate_files.py | 0% <0%> (ø) | |
Continue to review full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f5efc2d...4c30e16.
Let's put the code in a function migrate_v1_to_v2, as we might have other migrations in the future.
We should put a VERSION file in the data/ repository, so we know which migration should be applied.
What should we write in VERSION file?
> What should we write in VERSION file?

Right now, just 1. Then the migrate script should check the VERSION file: if it's 1, it should call migrate_v1_to_v2 (which changes it to 2).
(The VERSION file should be added to the autowebcompat-data repo on GitLab)
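A minimal sketch of the VERSION bookkeeping this implies (the helper names and the `data_dir` parameter are assumptions for illustration, not the actual migrate_files.py API):

```python
import os

def read_version(data_dir):
    """Return the data-format version stored in data_dir/VERSION.

    A checkout that predates the VERSION file is assumed to be at version 1.
    """
    path = os.path.join(data_dir, "VERSION")
    if not os.path.exists(path):
        return 1
    with open(path) as f:
        return int(f.read().strip())

def write_version(data_dir, version):
    """Record the new data-format version after a successful migration."""
    with open(os.path.join(data_dir, "VERSION"), "w") as f:
        f.write("%d\n" % version)
```

The migrate script would then call migrate_v1_to_v2 only when read_version(...) returns 1, and the migration itself would finish with write_version(..., 2).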
> Then the migrate script should check the VERSION file, if it's 1 it should call migrate_v1_to_v2 (which changes it to 2)
You mean that after checking VERSION I should call migrate_v1_to_v2 by default? Do we expect future migrations of the form migrate_v2_to_v3, migrate_v3_to_v4 or migrate_v1_to_v3, migrate_v1_to_v4?
@marco-c should I make the to_version an argparse input?
> You mean that after checking VERSION I should call migrate_v1_to_v2 by default? Do we expect future migrations of the form migrate_v2_to_v3, migrate_v3_to_v4 or migrate_v1_to_v3, migrate_v1_to_v4?
migrate_v1_to_v2 shouldn't always be called, but only if VERSION is == 1. Future migrations will be migrate_v2_to_v3, migrate_v3_to_v4 and so on; this way we can support all possible migrations.
> @marco-c should I make the to_version an argparse input?
The to_version should always be the latest (in this case 2).
Okay. So should I recursively call all of them and update the VERSION file at the end, or should we just call one, change the VERSION file ourselves if needed, and call the script again?
> Okay. So should I recursively call all of them and update the VERSION file at the end, or should we just call one, change the VERSION file ourselves if needed, and call the script again?
Not recursively, but one after another. Each function should also update the VERSION file.
In pseudocode, something like this:
for i in range(CURRENT_VERSION, LATEST_VERSION):
    migrate_v(i)_to_v(i+1)
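A runnable sketch of that loop, using a registry of one-step migrations. Note that migrate_v2_to_v3, the MIGRATIONS dict, and the `applied` log are hypothetical illustrations; only migrate_v1_to_v2 exists in this PR:

```python
applied = []  # records which migrations ran, purely for illustration

def migrate_v1_to_v2():
    # ...rename files to the new conventions, then bump VERSION to 2...
    applied.append((1, 2))

def migrate_v2_to_v3():
    # Hypothetical future migration; each function also updates VERSION.
    applied.append((2, 3))

# One-step migrations, keyed by the version they start from.
MIGRATIONS = {1: migrate_v1_to_v2, 2: migrate_v2_to_v3}
LATEST_VERSION = 3

def migrate(current_version):
    # Apply each one-step migration in order until the data is at LATEST_VERSION.
    for i in range(current_version, LATEST_VERSION):
        MIGRATIONS[i]()

migrate(1)  # applies 1->2, then 2->3
```

Because each step is one version, chaining the single-step functions covers every possible starting version without needing direct migrate_v1_to_v3-style functions.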
I'm adding the VERSION file to the data repository.
@marco-c Can you see the VERSION file now? I added it previously also. I realized I had to commit in the submodule also before adding.
You can't update the submodule without actually updating it on GitLab, otherwise it's not available to anyone except you (in fact, Travis fails to clone the submodule now, as the hash you provided is local to your machine).
Don't worry about adding the VERSION file, as I'm doing it now.
For the future, when you want to change something in data or tools, you actually have to send a pull request on GitLab, so that it can be merged and can be available to everybody.
Okay! I will try it next time. I actually wanted to add some new full_page_screenshots earlier but wasn't able to at the time!
Should we keep a file alongside the VERSION file that records which screenshots are present in that version? Otherwise we won't be able to push images generated by the current crawler, since the migrate function would then try to migrate them too. (Or simply a timestamp?)
> Should we keep a file alongside the VERSION file that records which screenshots are present in that version? Otherwise we won't be able to push images generated by the current crawler, since the migrate function would then try to migrate them too. (Or simply a timestamp?)
I guess it's better if we just do the migration before adding new images.
Migrate data image files, text files, and labelled files in label_persons according to the new conventions introduced in #88.
Fixes #251