Closed mfidino closed 4 years ago
Hi @mfidino -- Let me start with a couple of clarifications about the CLI's behavior:

- The `--allow-missing` option was meant to allow subjects with no media files to be added, not to allow subjects where media files are missing or unable to be uploaded.
- When `file_name_4` is left blank, the code correctly assumes that you meant to include a filename there, and when it can't find and upload a media file whose name is ``, it throws an error.

I argue that the current behavior of the code is a feature, not a bug: it is better to throw errors about potentially missing media files than to unexpectedly upload subjects that are missing data.
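Blank or missing media entries can also be caught before running the upload. The following is a minimal sketch, not part of the Panoptes CLI: `find_media_problems` is a hypothetical helper, and the column names `subject_id` and `file_name_1` through `file_name_4` are assumptions based on the manifest discussed in this thread.

```python
import csv
import io
import os


def find_media_problems(rows, media_columns, base_dir="."):
    """Return (row_number, column, reason) tuples for unusable media cells.

    Hypothetical helper, not part of the Panoptes CLI. `rows` is an iterable
    of dicts such as csv.DictReader yields. Row numbers count data rows only
    (the header is not counted).
    """
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for column in media_columns:
            value = (row.get(column) or "").strip()
            if not value:
                problems.append((row_number, column, "blank"))
            elif not os.path.isfile(os.path.join(base_dir, value)):
                problems.append((row_number, column, "file not found"))
    return problems


# Example manifest held in memory; the second subject has no fourth image.
# Column names are assumed for illustration.
EXAMPLE_MANIFEST = (
    "subject_id,file_name_1,file_name_2,file_name_3,file_name_4\n"
    "1,a1.jpg,a2.jpg,a3.jpg,a4.jpg\n"
    "2,b1.jpg,b2.jpg,b3.jpg,\n"
)
MEDIA_COLUMNS = ["file_name_1", "file_name_2", "file_name_3", "file_name_4"]

example_rows = list(csv.DictReader(io.StringIO(EXAMPLE_MANIFEST)))

# Keep only the blank cells; the example image files do not exist on disk,
# so every non-blank cell would otherwise be reported as "file not found".
blank_cells = [
    problem
    for problem in find_media_problems(example_rows, MEDIA_COLUMNS)
    if problem[2] == "blank"
]
print(blank_cells)
```

With a real manifest, you would read the rows from the CSV file and pass the directory containing the images as `base_dir`; any reported row is one the upload is likely to reject.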
Also note: if you had ordered your manifest differently (e.g., starting with an entry where only three image filenames were included), the code would have run successfully but in an unexpected way: the `file_name_4` column would have been interpreted as text metadata and all subjects would have been uploaded with only 3 JPG files each.
There is a workaround for your case where the number of media files varies per subject: create a new manifest file for each group of subjects depending on the number of media files (e.g., those with 2 JPG images, those with 3 JPG images, those with 4 JPG images) and use the CLI to upload each batch, one at a time. Each upload can point to the same subject set, so it just requires a little additional bookkeeping when preparing the data for upload.
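The bookkeeping for this workaround can be scripted. Below is a rough sketch under stated assumptions, not an official tool: `split_rows_by_media_count` and `write_manifest` are hypothetical helpers, the column names are assumed from this thread, and shifting filenames into the leftmost media columns is a design choice made here so that every row in a group has the same columns.

```python
import csv
import io
from collections import defaultdict


def split_rows_by_media_count(rows, media_columns):
    """Group manifest rows by how many media filenames each row contains.

    Hypothetical helper, not part of the Panoptes CLI. Filenames are shifted
    left so a group with k files uses only the first k media columns; the
    unused media columns are dropped from that group's rows.
    """
    groups = defaultdict(list)
    for row in rows:
        # Collect the non-blank filenames in column order.
        filenames = [
            name
            for name in ((row.get(c) or "").strip() for c in media_columns)
            if name
        ]
        # Keep every non-media column as metadata, unchanged.
        new_row = {k: v for k, v in row.items() if k not in media_columns}
        new_row.update(zip(media_columns, filenames))
        groups[len(filenames)].append(new_row)
    return dict(groups)


def write_manifest(rows, handle):
    """Write one group's rows as CSV; all rows in a group share the same keys."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


# Example manifest held in memory; column names are assumed for illustration.
EXAMPLE_MANIFEST = (
    "subject_id,file_name_1,file_name_2,file_name_3,file_name_4\n"
    "1,a1.jpg,a2.jpg,a3.jpg,a4.jpg\n"
    "2,b1.jpg,b2.jpg,b3.jpg,\n"
    "3,c1.jpg,c2.jpg,c3.jpg,c4.jpg\n"
)
MEDIA_COLUMNS = ["file_name_1", "file_name_2", "file_name_3", "file_name_4"]

rows = list(csv.DictReader(io.StringIO(EXAMPLE_MANIFEST)))
groups = split_rows_by_media_count(rows, MEDIA_COLUMNS)

for count, group_rows in sorted(groups.items()):
    # In real use, write to a file instead, e.g.
    # open(f"manifest_{count}_images.csv", "w", newline="")
    out = io.StringIO()
    write_manifest(group_rows, out)
    print(f"--- subjects with {count} media files ---")
    print(out.getvalue())
```

Each resulting manifest can then be passed to `panoptes subject-set upload-subjects` in turn, with every upload pointing to the same subject set.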
I had some back-and-forth email with @trouille and they suggested I open an issue here. When uploading subjects to a subject set, we have at times seen the upload error out when 4 or more images are tied to a subject id. In this specific case, a user is trying to upload a maximum of 6 images per subject id (though I tested this with 4 and 5 images per id and the same issue occurred). This happens when not every subject id has an image in the fourth, fifth, or sixth image path column (depending on the total number of images). For example, running:
panoptes subject-set upload-subjects --allow-missing -m image/JPG {subject set id here} manifest_1.csv
with this manifest:
The above call will error out with:
Error: File "C:" could not be found.
Conversely, this manifest would upload:
None of the images are corrupted. Looking through a bit of the code for `upload-subjects`, I'm guessing it is related to this function here: https://github.com/zooniverse/panoptes-cli/blob/cb2e6fc3a17644055102f396344f8390c3878d3f/panoptes_cli/commands/subject_set.py#L276-L284 but I'm not 100% sure.