LAAC-LSCP / ChildProject

Python package for the management of day-long recordings of children.
https://childproject.readthedocs.io
MIT License

Zooniverse upload tracking #427

Closed LoannPeurey closed 1 year ago

LoannPeurey commented 1 year ago

Is your feature request related to a problem? Please describe. When uploading to Zooniverse, the CSV that tracks all clips is updated only at the end of the run. If a single upload fails partway through, the uploads that succeeded before it are never recorded, and the CSV then has to be edited by hand to mark them as uploaded.

Describe the solution you'd like When an error occurs, the program should still save the CSV before exiting.
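A minimal sketch of that behaviour, assuming the chunks are held as a list of dicts and that a single upload is a callable that raises on failure. The names `upload_chunks`, `upload_one` and the `uploaded` column are placeholders for illustration, not ChildProject's actual API:

```python
import csv


def upload_chunks(chunks, upload_one, csv_path):
    """Upload every pending chunk and always write the CSV afterwards.

    `chunks` is a list of dicts and `upload_one` a callable that raises
    on failure; both are hypothetical stand-ins for the real pipeline.
    """
    for chunk in chunks:
        chunk.setdefault("uploaded", False)
    try:
        for chunk in chunks:
            if chunk["uploaded"]:
                continue  # already recorded by a previous run
            upload_one(chunk)  # may raise
            chunk["uploaded"] = True  # marked only after success
    finally:
        # Runs on normal completion, on any exception and on Ctrl-C,
        # so successful uploads are recorded before the error propagates.
        fieldnames = list(dict.fromkeys(k for c in chunks for k in c))
        with open(csv_path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(chunks)
```

The exception is still re-raised after the `finally` block, so the program exits with an error as before; only the bookkeeping changes.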

khitczenko commented 1 year ago

It works well for this error:

failed to save chunk 181742. an exception has occured: Validation failed: Locations is invalid
panoptes_client.panoptes.PanoptesAPIException: Validation failed: Locations is invalid

But runs into trouble for this error:

panoptes_client.panoptes.PanoptesAPIException: Attempted to update a stale object: SubjectSet.

LoannPeurey commented 1 year ago

The maximum uploaded subjects error should stop the upload even when the ignore errors flag is set, since every upload from that point on is certain to fail.

failed to save chunk 29736. an exception has occured: User has uploaded 253382 subjects of 250000 maximum
Traceback (most recent call last):
  File "/scratch2/lpeurey/conda/childproject/lib/python3.7/site-packages/ChildProject/pipelines/zooniverse.py", line 450, in upload_chunks
    subject.save()

Traceback (most recent call last):
  File "", line 1, in
  File "/scratch2/lpeurey/conda/childproject/lib/python3.7/site-packages/panoptes_client/subject.py", line 145, in save
    log_args=False,
  File "/scratch2/lpeurey/conda/childproject/lib/python3.7/site-packages/redo/__init__.py", line 170, in retry
    return action(*args, **kwargs)
  File "/scratch2/lpeurey/conda/childproject/lib/python3.7/site-packages/panoptes_client/panoptes.py", line 818, in save
    etag=self.etag
  File "/scratch2/lpeurey/conda/childproject/lib/python3.7/site-packages/panoptes_client/panoptes.py", line 408, in post
    retry=retry,
  File "/scratch2/lpeurey/conda/childproject/lib/python3.7/site-packages/panoptes_client/panoptes.py", line 285, in json_request
    json_response['errors']
panoptes_client.panoptes.PanoptesAPIException: User has uploaded 253382 subjects of 250000 maximum
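One possible way to separate this quota error from errors that can safely be skipped is sketched below. It matches on the message text shown in the log above, which is an assumption: the API may expose something more structured, and the wording could change. `should_abort` and its `ignore_errors` parameter are hypothetical names.

```python
import re

# Message format copied from the log above; matching on text is an assumption.
QUOTA_PATTERN = re.compile(r"User has uploaded \d+ subjects of \d+ maximum")


def should_abort(error, ignore_errors=False):
    """Decide whether the upload loop must stop after `error`.

    A quota error is always fatal, because every later upload would fail
    the same way. Other errors stop the loop only when errors are not
    being ignored.
    """
    if QUOTA_PATTERN.search(str(error)):
        return True
    return not ignore_errors
```

With `ignore_errors=True`, a "Locations is invalid" failure would let the loop move on to the next chunk, while the quota failure would end it.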

LoannPeurey commented 1 year ago

We may want to ensure that the CSV gets written even in extreme cases (e.g. SIGTERM). This can be done either by writing it periodically or by handling termination signals, keyboard interrupts, etc.

LoannPeurey commented 1 year ago

405 should be implemented as well

LoannPeurey commented 1 year ago

Regarding orphan subjects, this thread is relevant: https://github.com/zooniverse/panoptes/issues/1529

LoannPeurey commented 1 year ago

The chunks CSV file records the subject set name but not its id. The name can change, whereas the id cannot and is easier to match and retrieve, so we should record it, probably in addition to the name to keep backwards compatibility.
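A rough sketch of how this could look, with rows as plain dicts. The column names `subject_set` and `subject_set_id` are assumptions for illustration, as are both helper functions:

```python
def record_subject_set(row, name, set_id):
    """Return a copy of `row` with both the subject set name and id."""
    updated = dict(row)
    updated["subject_set"] = name  # kept for backwards compatibility
    updated["subject_set_id"] = str(set_id)
    return updated


def belongs_to(row, set_id, name):
    """Check whether `row` was uploaded to the given subject set.

    The id is preferred because it never changes. Rows written before
    the id column existed fall back to matching on the name.
    """
    recorded_id = row.get("subject_set_id", "")
    if recorded_id:
        return recorded_id == str(set_id)
    return row.get("subject_set") == name
```

Older CSV files without the id column keep working through the name fallback, and rows written from now on still match after the subject set is renamed.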