Phelimb / BIGSI

BItsliced Genomic Signature Index - Efficient indexing and search in very large collections of WGS data
http://www.bigsi.io
MIT License

Broken prebuilt index download link in bigsi.readme.io #48

Closed luizirber closed 5 years ago

luizirber commented 5 years ago

Hi!

https://bigsi.readme.io/ points to http://ftp.ebi.ac.uk/pub/software/bigsi/nat_biotech_2018/all-microbial-index-v0.3, which returns a 404. The correct link seems to be http://ftp.ebi.ac.uk/pub/software/bigsi/nat_biotech_2018/all-microbial-index-v03/

(I was going to open a PR, but I don't know where the docs are...)

luizirber commented 5 years ago

(Following a suggestion from https://twitter.com/darwinbandoy/status/1121550099455152133)

This wget incantation works for downloading the BIGSI index from the correct link:

$ wget -c -e robots=off --cut-dirs 6 -m -np http://ftp.ebi.ac.uk/pub/software/bigsi/nat_biotech_2018/all-microbial-index-v03/

The server's robots.txt disallows mirroring, which is understandable but annoying in this case...
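
For reference, here is the same command with each flag annotated. This is only a sketch: the variable `INDEX_URL` and the function name `fetch_bigsi_index` are made up here for illustration, and the flag descriptions paraphrase the wget manual.

```shell
# Corrected index location from this issue (note "v03", not "v0.3").
INDEX_URL="http://ftp.ebi.ac.uk/pub/software/bigsi/nat_biotech_2018/all-microbial-index-v03/"

# Hypothetical wrapper around the wget command quoted above.
fetch_bigsi_index() {
    # -c             resume partially downloaded files instead of restarting
    # -e robots=off  run the wgetrc command "robots=off", so robots.txt is ignored
    # --cut-dirs 6   drop up to 6 leading remote directory components locally
    # -m             mirror mode: recursive download with time-stamping
    # -np            never ascend to the parent directory while recursing
    wget -c -e robots=off --cut-dirs 6 -m -np "$INDEX_URL"
}

# Uncomment to start the download (the index is large):
# fetch_bigsi_index
```

Since `-nH` is not given, wget still creates a local directory named after the host (`ftp.ebi.ac.uk/`) and places the files under it.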

iqbal-lab commented 5 years ago

@Phelimb is on a plane, so we might not address this until next week. But to clarify: this index is frozen, so why does mirroring matter? It exists purely to match the publication. I've talked on Twitter about a live index, but that will be hosted elsewhere (and will be multiple smaller, distributed indexes rather than a single one).

luizirber commented 5 years ago

Oh, mirroring only matters for having a one-liner; other approaches would involve more work =P

Looking forward to the live index!

Phelimb commented 5 years ago

Thanks @luizirber. I've fixed this URL on https://bigsi.readme.io/ (which is generally editable as a wiki, except for the landing page).

I'm afraid I can't help with the FTP mirroring: the server is managed by the EBI, and I don't think I have access to its robots.txt.

Going to close this issue now, but please reopen if I can help further.