Open dmgaldi opened 2 years ago
Here's the code that is throwing. It could be some kind of eventual consistency issue with iRODS, as it seems the code isn't finding the files at request time but we're finding them after the fact.
It looks like the tarball is unpacked into a staging area and then rsynced into the user's directory.
https://github.com/VEuPathDB/EuPathDBIrods/blob/master/Scripts/ud.re#L119-L155
When getting the list of user datasets for a user, we assume that if the directory exists, it is a valid dataset. It's possible that the rsync is only partially complete, with meta.json and dataset.json not yet present, which could cause us to fail to list the directory.
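To illustrate the assumption described above, here is a minimal sketch of a more defensive listing that skips directories still missing meta.json or dataset.json. It is not the actual service code: it uses a local filesystem path as a stand-in for the iRODS collection, and the names `REQUIRED_FILES`, `is_complete_dataset`, and `list_complete_datasets` are hypothetical.

```python
from pathlib import Path
from typing import List

# Assumption: these two files mark a fully synced dataset, per the discussion above.
REQUIRED_FILES = ("meta.json", "dataset.json")


def is_complete_dataset(dataset_dir: Path) -> bool:
    """Return True only if the directory exists and holds every required file."""
    return dataset_dir.is_dir() and all(
        (dataset_dir / name).is_file() for name in REQUIRED_FILES
    )


def list_complete_datasets(user_datasets_dir: Path) -> List[str]:
    """List dataset directory names, skipping any that look partially synced.

    A local directory stands in for the iRODS collection here (an assumption
    made purely for illustration).
    """
    if not user_datasets_dir.is_dir():
        return []
    return sorted(
        child.name
        for child in user_datasets_dir.iterdir()
        if is_complete_dataset(child)
    )
```

With this approach a dataset that is mid-rsync would simply be absent from the list until both files arrive, rather than causing the whole listing to throw.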
It's worth checking whether this particular dataset was listed soon after its creation, or checking the iRODS rules logs, if we have them, to see if they corroborate this picture.
Looks like the error came in on 08/Jul/2022|08:30:26
Looks like the write to iRODS came in 2 seconds earlier:
2022-07-08 08:30:23.140 [rid: ] DEBUG Irods:47 - writing dataset dataset_u40_t1657283421542_p38.tgz to iRODS /ebrc/workspaces/lz
2022-07-08 08:30:24.007 [rid: ] DEBUG Irods:65 - writing flag /ebrc/workspaces/flags/dataset_u40_t1657283421542_p38.txt to iRODS /ebrc/workspaces/flags
We confirmed that the genelist.txt file was created in the iRODS workspace at 08/Jul/2022|08:30:27, which is one second after the request came in and two seconds after the tarball was written. This more or less confirms the above theory.
Next step is to follow up with the front-end folks to see if they have any idea why we might be running into this issue now, since in theory this bug has been there all along.
i experienced this myself.. i think it happens when you go to the All datasets page quickly after an upload... it is clearly some kind of out-of-sync situation.. need more detail
E-mail from Cristina with details: