ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum
Apache License 2.0

Help with locating media files on TACC #7877

Closed ebraker closed 1 week ago

ebraker commented 2 weeks ago

I need to access files on TACC that I can no longer find. I have 21 daily folders that may be in this tar file (arctos-pgdata.tar.bz2)?

However, when I click on the zipped file, it just downloads (says it will take more than 80 minutes), so I cancelled the procedure.


We essentially went back and brightened all of our early jpeg images for the oMeso project. We've already overwritten all the files present in /corral-tacc/projects/arctos/web/ucm/oMeso_herps, but again, we have 21 folders outstanding that I can't find. I'm assuming they are in the tar file, but they could be somewhere else(?). I searched around in the arctos project but couldn't find anything.

These are the missing folders; all the images contained within have the following base URL: https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/ (e.g., https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/2020-11-24/UCM_HERP_51385_Urosaurus_nigricaudus_ventral.jpg).

2020-11-24 2020-12-10 2021-01-18 2021-01-25 2021-02-01 2021-02-08 2021-03-02 2021-03-22 2021-03-30 2021-04-12 2021-04-27 2021-05-04 2021-05-25 2021-06-07 2021-06-08 2021-06-15 2021-06-28 2021-07-05 2021-07-12 2021-07-27 2021-08-02

ebraker commented 2 weeks ago

Alternatively, I have all of the old media bulkloaders, so I could reload all of these folders to the new allocation, /corral-tacc/projects/arctos/web/ucm/oMeso_herps; however, I would need help editing the existing Arctos media to have a "ucm" vs "UCM" in the URL (maybe that could be magicked?). Or help with deleting all those images and reloading, but that seems like a pain...

dustymc commented 2 weeks ago

@ebraker that would have been a direct allocation, not through Arctos; Chris Jordan should be able to point you in the right direction.

ebraker commented 2 weeks ago

@dustymc Chris says these images are on an old iRODS system and the original access method for them is shut down.

Is there any chance that you could globally update all media objects with URIs that contain the address "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/" to "https://web.corral.tacc.utexas.edu/ucm/oMeso_Herps/" (essentially replace UCM with ucm)?

If this is a possibility, I will go ahead and copy the 19 folders' worth of photos that I have stored on our imaging computer onto the ucm/oMeso_Herps web directory, and the updated URI should correctly point to them. Then I can just ask Chris to delete the UCM/oMeso_Herps folder on the old iRODS system.

If not, I will see what I can figure out with Chris.

dustymc commented 2 weeks ago

@ebraker yes I can make replace-updates.
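
For readers following along, the kind of replace-update being discussed can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the update that was run against Arctos: the function name `rewrite_media_uri` is hypothetical, and the new prefix is assumed from the request to "replace UCM with ucm". The example URI is the one quoted in the original request.

```python
# Old base URL, as quoted in the thread.
OLD_PREFIX = "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/"
# ASSUMPTION: target prefix inferred from "replace UCM with ucm".
NEW_PREFIX = "https://web.corral.tacc.utexas.edu/ucm/oMeso_Herps/"


def rewrite_media_uri(uri, old=OLD_PREFIX, new=NEW_PREFIX):
    """Swap the old prefix for the new one; leave non-matching URIs untouched."""
    if uri.startswith(old):
        return new + uri[len(old):]
    return uri


example = (
    "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/"
    "2020-11-24/UCM_HERP_51385_Urosaurus_nigricaudus_ventral.jpg"
)
rewritten = rewrite_media_uri(example)
print(rewritten)
```

Matching on the full prefix rather than on the bare string "UCM" matters here, because the file names themselves begin with "UCM_HERP" and should not be lowercased.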

ebraker commented 2 weeks ago

Ooh, great! I will put this on my list for tomorrow and message you when I'm ready to run that update. Thanks!

ebraker commented 2 weeks ago

@dustymc OK, I have moved all my folders to the /corral-tacc/projects/arctos/web/ucm/oMeso_herps allocation. Will you replace all "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/" with "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/" please?

dustymc commented 2 weeks ago

> replace all "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/" with "https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/" please?

I think you had a copypaste fail or something. I took a guess, let me know if this looks like the right thing and everything and such, or where I got lost.

temp_me_up.csv.zip

arctosprod@arctos>> select count(*) from temp_me_up;
 count 
-------
   359

arctosprod@arctos>> select split_part(replace(media_uri,'https://web.corral.tacc.utexas.edu/UCM/oMeso_Herps/',''),'/',1) f, count(*) c from temp_me_up group by f order by f;
     f      | c  
------------+----
 2020-11-24 | 56
 2020-12-10 | 38
 2021-01-18 | 13
 2021-01-25 |  9
 2021-02-01 | 13
 2021-02-08 | 23
 2021-03-02 | 24
 2021-03-22 | 21
 2021-03-30 | 20
 2021-04-12 | 26
 2021-04-27 | 24
 2021-05-04 | 22
 2021-05-25 | 10
 2021-06-07 | 20
 2021-06-08 |  8
 2021-06-15 | 14
 2021-06-28 |  6
 2021-07-05 | 12
(18 rows)
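
The per-folder counts above can be cross-checked against the 21 folders listed in the original request. The following is a minimal sketch in Python that uses only the folder names and counts already shown in this thread; the variable names are illustrative. A folder that shows up as missing here may simply have had no media records in temp_me_up, so this is a sanity check rather than a diagnosis.

```python
# Folders listed in the original request.
requested = (
    "2020-11-24 2020-12-10 2021-01-18 2021-01-25 2021-02-01 2021-02-08 "
    "2021-03-02 2021-03-22 2021-03-30 2021-04-12 2021-04-27 2021-05-04 "
    "2021-05-25 2021-06-07 2021-06-08 2021-06-15 2021-06-28 2021-07-05 "
    "2021-07-12 2021-07-27 2021-08-02"
).split()

# Folder -> record count, copied from the query output above.
found = {
    "2020-11-24": 56, "2020-12-10": 38, "2021-01-18": 13, "2021-01-25": 9,
    "2021-02-01": 13, "2021-02-08": 23, "2021-03-02": 24, "2021-03-22": 21,
    "2021-03-30": 20, "2021-04-12": 26, "2021-04-27": 24, "2021-05-04": 22,
    "2021-05-25": 10, "2021-06-07": 20, "2021-06-08": 8, "2021-06-15": 14,
    "2021-06-28": 6, "2021-07-05": 12,
}

# Requested folders with no rows in the query result.
missing = [folder for folder in requested if folder not in found]
total = sum(found.values())

print(len(requested), len(found), total)  # 21 18 359
print(missing)  # ['2021-07-12', '2021-07-27', '2021-08-02']
```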

ebraker commented 2 weeks ago

Doh! You guessed right! Thank you!

dustymc commented 2 weeks ago

Sorry I wasn't clear: I DID NOT update anything. If that looks right, I can, though.

ebraker commented 2 weeks ago

oh, haha, YES! that looks right!

dustymc commented 1 week ago

done