Open hexylena opened 4 years ago
What is the initial script that populates the folders? Which API routes does it use?
@mvdbeek this one https://github.com/usegalaxy-eu/shared-data/blob/master/run.sh
I've moved this to ephemeris then. I don't think there is a logical path towards de-duplication on Galaxy's end (at least not without a checksum or something else that can identify a piece of data), so this should be handled in the setup-data-libraries script IMO.
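To make the checksum idea concrete, here is a minimal sketch of how a script could skip data it has already uploaded. It assumes the script keeps its own record of checksums; `plan_uploads`, `sha256_of`, and the shape of `candidates` are hypothetical names for illustration, not part of ephemeris or Galaxy.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a blob of data."""
    return hashlib.sha256(data).hexdigest()


def plan_uploads(candidates, known_checksums):
    """Split candidates into names to upload and names to skip.

    candidates: iterable of (name, data_bytes) pairs (hypothetical shape).
    known_checksums: set of hex digests already in the library
    (assumption: the script records these itself between runs).
    """
    seen = set(known_checksums)
    to_upload, skipped = [], []
    for name, data in candidates:
        digest = sha256_of(data)
        if digest in seen:
            # Same content already present, or queued earlier in this run.
            skipped.append(name)
        else:
            seen.add(digest)
            to_upload.append(name)
    return to_upload, skipped
```

Keying on content rather than on name means a renamed copy of the same file is also caught, at the cost of having to read every file before deciding.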
Sounds good. Thanks for the move, I should've made it here in the first place.
@Slugger70 @natefoo @lecorguille this issue affects all of you.
So this is quite unfortunate. Roughly every time it runs, it creates some duplicates.
EU has seen this quite prominently; the other servers less so. I'd never seen it on them until I wrote the script to check them just now, and judging by the counts it has clearly been going on for quite some time.
https://github.com/usegalaxy-eu/shared-data/blob/master/no-dupes.sh is the script to check, I'm just dumping the contents of the GTN folder.
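For anyone who wants to run the same kind of check on an already fetched folder listing, a rough sketch follows. It is not the contents of no-dupes.sh; it assumes each entry is a dict with `name` and `id` keys, and `find_duplicates` is a hypothetical helper.

```python
from collections import defaultdict


def find_duplicates(entries):
    """Return names that occur more than once in a folder listing.

    entries: list of dicts with "name" and "id" keys (assumed shape).
    The result maps each duplicated name to the list of its ids.
    """
    ids_by_name = defaultdict(list)
    for entry in entries:
        ids_by_name[entry["name"]].append(entry["id"])
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}
```

The number of ids per name gives the duplicate count for that item, which is the sort of figure the counts above refer to.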
Also, that API probably doesn't need enforced authentication, since I can browse those while anonymous on the web.
I can add another script to try and remove duplicates, but shared data already has one script hacking around upload permissions, and another feels like too much duct tape.