irods / irods_capability_automated_ingest


Unreadable files should not be added to sync chunks #277

Closed alanking closed 2 weeks ago

alanking commented 3 weeks ago

In the past, finding an entry for which the executing user does not have read permissions resulted in the entry being skipped (this code is from v0.3.8):

https://github.com/irods/irods_capability_automated_ingest/blob/3fcc437818c8685f6955e2c0481c31c9b600b733/irods_capability_automated_ingest/sync_task.py#L334-L336

In v0.4.0, the entries were erroneously being added to the chunks for syncing rather than being skipped:

https://github.com/irods/irods_capability_automated_ingest/blob/c60e2cc94f3ccb182c88db2c5bf7dff7b344ed9a/irods_capability_automated_ingest/scanner.py#L106-L109

This could cause failures down the line when specific keys are accessed in obj_stats (which would be empty when the file is not readable):

https://github.com/irods/irods_capability_automated_ingest/blob/c60e2cc94f3ccb182c88db2c5bf7dff7b344ed9a/irods_capability_automated_ingest/sync_task.py#L318-L323
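To make the failure mode concrete, here is a minimal reproduction. The key name `size` is illustrative only; the plugin's actual obj_stats keys may differ.

```python
# Minimal reproduction: obj_stats stays empty because the scanner could not
# read/stat the file, but downstream sync code still indexes into it.
# The key name "size" is illustrative, not necessarily the plugin's key.
obj_stats = {}

try:
    size = obj_stats["size"]  # KeyError on an empty dict
except KeyError as exc:
    print(f"sync task would fail here: missing key {exc}")
```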

At the very least, we need to skip adding these to the chunk for syncing.

Additionally, we may want to log an info- or warning-level message to alert the user that these entries were not attempted.
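At the level of the scanner loop, the skip-and-warn behavior could look roughly like this. `collect_readable_entries` is a hypothetical helper for illustration, not the plugin's actual code.

```python
import logging
import os
import tempfile

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("ingest.scanner")

def collect_readable_entries(paths):
    """Keep only entries the executing user can read; warn about the rest.
    Hypothetical helper, not the plugin's real API."""
    chunk = []
    for path in paths:
        # os.access also returns False for paths that do not exist.
        if not os.access(path, os.R_OK):
            logger.warning(
                "%s is not readable by this user; not scheduling it for sync",
                path,
            )
            continue
        chunk.append(path)
    return chunk

# Demonstration with one readable file and one missing (hence unreadable) path.
with tempfile.NamedTemporaryFile() as f:
    print(collect_readable_entries([f.name, "/no/such/file"]))
```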

We could also cause the sync job to fail, but just for these entries. This could be done by putting these files in their own sync tasks (a chunk of 1) with a "poison pill" key that makes the task fail, so that the user is alerted to the files which failed to sync. That way, the rest of the sync job can complete and the failures are appropriately flagged for later handling.
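As a rough sketch of that idea (all names here — `POISON_PILL_KEY`, `build_chunks`, `process_chunk` — are hypothetical, not the plugin's actual API):

```python
import os

# Hypothetical sketch of the "poison pill" scheduling described above.
POISON_PILL_KEY = "unreadable"

def build_chunks(paths, chunk_size=50):
    """Batch readable entries normally; give each unreadable entry its own
    single-entry chunk flagged to fail."""
    readable, chunks = [], []
    for path in paths:
        # os.access also returns False for paths that do not exist.
        if os.access(path, os.R_OK):
            readable.append(path)
        else:
            chunks.append({"entries": [path], POISON_PILL_KEY: True})
    for i in range(0, len(readable), chunk_size):
        chunks.append({"entries": readable[i:i + chunk_size]})
    return chunks

def process_chunk(chunk):
    """Stand-in for the Celery sync task: poisoned chunks fail loudly."""
    if chunk.get(POISON_PILL_KEY):
        raise PermissionError(f"cannot read: {chunk['entries'][0]}")
    return [f"synced {entry}" for entry in chunk["entries"]]

# The whole job still runs; only the poisoned tasks are recorded as failures.
failures = []
for chunk in build_chunks(["/no/such/a", "/no/such/b"]):
    try:
        process_chunk(chunk)
    except PermissionError as exc:
        failures.append(str(exc))
print(failures)
```

Because each unreadable file occupies its own chunk, its failure is isolated: the rest of the job completes, and each failed task records exactly which file was at fault.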

trel commented 3 weeks ago

I like the separate failing task - it's clean... - and could make bookkeeping a lot more sane.

OR...

creates chaos? too many moving parts? for the admin?

logging a warning may be sufficient if we're at all worried about complexity.

trying to think what would be more surprising / annoying as the person running this sync...

feels like no errors and complaining could lead to surprising silent assumptions as soon as someone puts this in a script and sees a clean return code 0.

so, i think i'm leaning towards failed tasks, and a non-zero return code for the whole thing (if --synchronous was used).

alanking commented 3 weeks ago

> trying to think what would be more surprising / annoying as the person running this sync...
>
> * things that complained and didn't sync (and are still there) - no errors
> * things that failed causing errors in my sync
>
> feels like no errors and complaining could lead to surprising silent assumptions as soon as someone puts this in a script and sees a clean return code 0.
>
> so, i think i'm leaning towards failed tasks, and a non-zero return code for the whole thing (if --synchronous was used).

I agree. Presently, error messages would appear... but only in the Celery logs, which the end user may never see. In fact, the job would eventually fail anyway because the file could not be opened for reading when it came time to sync the data. But that would also cause everything else in that chunk to fail, which could be confusing to sort through later.

That's where the idea came from for the doomed / poison-pill tasks, so that they would only indicate the failed files. We could even chunk these known failures if we felt so inclined.

korydraughn commented 3 weeks ago

Agreed. Giving the user some feedback is good. As soon as the user inspects the output, they'll see which files failed and which files succeeded.

alanking commented 2 weeks ago

I'm contemplating separating out this failed tasks idea to a different issue with an "enhancement" label. Could be good to do this more generally than just for this case. Thoughts?

If we do that, then this issue would be to just not add the unreadable files to the sync chunk and log an error like was being done before.

alanking commented 2 weeks ago

I went ahead and created https://github.com/irods/irods_capability_automated_ingest/issues/296 for the "doomed task" idea. We can discuss that idea further there. For this issue, the important bit is to not schedule the unreadable file for syncing.

korydraughn commented 2 weeks ago

Yep, sounds good.

alanking commented 2 weeks ago

The unreadable files are no longer enqueued, and a test has been added to demonstrate this. #296 will address the enhancement piece which would isolate failed syncs to (intentionally) failed tasks.