grp-bork / spire_contribute

Download errors #5

Closed snayfach closed 9 months ago

snayfach commented 10 months ago

It appears there is a permissions issue on the MAG datasets. After navigating to https://spire.embl.de/study/MetaSUB?page=1 and clicking the download link https://swifter.embl.de/~fullam/spire/compiled/MetaSUB_spire_v1_MAGs.tar, the following error is returned:

You don't have permission to access /~fullam/spire/compiled/MetaSUB_spire_v1_MAGs.tar on this server.

The same error is encountered for other studies.

snayfach commented 10 months ago

I ended up downloading the MAGs individually using wget http://spire.embl.de/download_file/${MAG}, but fixing the issue should help other users.
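For anyone taking the same per-genome route, the loop can be sketched in Python roughly as follows. The URL pattern is the one from the wget command above; the ID list file `mag_ids.txt` (one MAG ID per line), the output directory name, and saving each file as `<id>.fna.gz` are assumptions made for illustration, not something the SPIRE site documents here.

```python
import urllib.error
import urllib.request
from pathlib import Path

# URL pattern taken from the wget command in this comment.
BASE_URL = "http://spire.embl.de/download_file/"


def mag_url(mag_id: str) -> str:
    """Build the per-genome download URL for one MAG ID."""
    return BASE_URL + mag_id.strip()


def download_mags(id_file: str, out_dir: str = "mags") -> list[str]:
    """Download every MAG listed in id_file and return the IDs that failed.

    id_file is a hypothetical text file with one MAG ID per line.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for line in Path(id_file).read_text().splitlines():
        mag_id = line.strip()
        if not mag_id:
            continue  # skip blank lines
        target = Path(out_dir) / f"{mag_id}.fna.gz"  # assumed file name
        try:
            urllib.request.urlretrieve(mag_url(mag_id), str(target))
        except urllib.error.URLError:
            # HTTPError (for example a 500 response) is a subclass of URLError.
            failed.append(mag_id)
    return failed


# Example call (needs network access and an existing ID file):
# failed_ids = download_mags("mag_ids.txt")
```

Collecting the failed IDs instead of stopping at the first error makes it possible to retry only those genomes later.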

snayfach commented 10 months ago

Update: this worked for most MAGs, but there are some with errors:

wget http://spire.embl.de/download_file/spire_mag_01272848
--2023-12-28 18:53:44--  http://spire.embl.de/download_file/spire_mag_01272848
Resolving spire.embl.de (spire.embl.de)... 194.94.44.56
Connecting to spire.embl.de (spire.embl.de)|194.94.44.56|:80... connected.
HTTP request sent, awaiting response... 500 INTERNAL SERVER ERROR
2023-12-28 18:53:45 ERROR 500: INTERNAL SERVER ERROR.

snayfach commented 10 months ago

Overall, 238249 of the 1158554 genomes listed in the spire_v1_genome_metadata.tsv.gz file return an internal server error when downloaded by ID, for example: wget http://spire.embl.de/download_file/spire_mag_01272848
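A count like this can be reproduced by requesting each URL and recording the status code. The sketch below is one possible way to do it, not the script actually used in this thread; the helper names are hypothetical, and the status lookup is passed in as a parameter so the logic can be tried without network access. The last line works out the failure rate from the two counts reported above.

```python
import urllib.error
import urllib.request

# URL pattern taken from the wget command in this thread.
BASE_URL = "http://spire.embl.de/download_file/"


def http_status(url: str, timeout: float = 30.0) -> int:
    """Return the HTTP status code for url without reading the body."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 500 for the failures reported here


def find_failures(mag_ids, fetch_status=http_status):
    """Return the MAG IDs whose download URL does not answer with 200."""
    return [m for m in mag_ids if fetch_status(BASE_URL + m) != 200]


def failure_fraction(n_failed: int, n_total: int) -> float:
    """Fraction of genomes that failed to download."""
    return n_failed / n_total


# Counts reported in this comment: 238249 failures out of 1158554 genomes.
print(f"{failure_fraction(238249, 1158554):.1%}")  # prints 20.6%
```

Connection-level problems (DNS failures, timeouts) are deliberately not caught in `http_status`, so they surface as exceptions rather than being counted as server errors.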

fullama commented 10 months ago

ok i fixed the obvious ones.. i have a script running to check every url to find any others that might be broken..

snayfach commented 10 months ago

Here's the list of the ones I found, in case it's helpful: spire_mags.txt.zip

snayfach commented 10 months ago

It looks like the problem persists for 80% of the 238249 MAGs that previously returned the internal server error.

snayfach commented 10 months ago

Is there any estimate on when this might be fixed?

fullama commented 10 months ago

Sorry I'm not working most of this week.. it's at the top of my list though so not too long

fullama commented 10 months ago

should be all sorted now.. let me know if you come across any more that aren't accessible!

snayfach commented 10 months ago

I just tried to download ~50 random ones that previously failed, and 5 of these still have errors:

spire_mag_01784224.fna.gz spire_mag_01868122.fna.gz spire_mag_01884850.fna.gz spire_mag_01888335.fna.gz spire_mag_01891463.fna.gz

$ wget http://spire.embl.de/download_file/spire_mag_01891463
--2024-01-05 22:30:05--  http://spire.embl.de/download_file/spire_mag_01891463
Resolving spire.embl.de (spire.embl.de)... 194.94.44.56
Connecting to spire.embl.de (spire.embl.de)|194.94.44.56|:80... connected.
HTTP request sent, awaiting response... 500 INTERNAL SERVER ERROR
2024-01-05 22:30:06 ERROR 500: INTERNAL SERVER ERROR.
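A spot check like this one, drawing a random sample from the previously failing IDs and retrying only those, might look like the following in Python. The function names are hypothetical, and `fetch_status` stands for any callable that maps a MAG ID to an HTTP status code; it is an assumption of this sketch rather than part of any SPIRE tooling.

```python
import random


def sample_for_recheck(failed_ids, k=50, seed=None):
    """Pick up to k previously failing IDs at random, without replacement."""
    ids = list(failed_ids)
    rng = random.Random(seed)  # pass a seed to make the sample repeatable
    return rng.sample(ids, min(k, len(ids)))


def still_failing(sample, fetch_status):
    """Return the sampled IDs that still do not answer with 200.

    fetch_status is a caller-supplied function: MAG ID -> HTTP status code.
    """
    return [m for m in sample if fetch_status(m) != 200]
```

With 5 failures in a sample of about 50, roughly 10% of the sampled genomes were still broken, which is only a rough indicator for the full list.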

fullama commented 10 months ago

so there is a small wrinkle in that our filesystem underwent an upgrade over the break and is not behaving at the minute.. i tried setting all the permissions again and as far as i can tell it should be ok (again).. 🤞 how does it look from your side now?

snayfach commented 9 months ago

Looks fixed, thank you!