ewels / sra-explorer

Web application to explore the Sequence Read Archive.
https://sra-explorer.info/
GNU General Public License v2.0

SRA-explorer only retrieves the first out of many entries for an SRA sample #2

Closed FelixKrueger closed 5 years ago

FelixKrueger commented 6 years ago

As an example, we used the SRA project SRP098829 and selected only the sample Diploid_27. SRA-explorer then produces a download file like this:

ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507528/SRR5507528.sra    SRR5507528_GSM2598386_Diploid_27_Mus_musculus_OTHER.sra

The same entry produces the following download file within Labrador:

ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507528/SRR5507528.sra    SRR5507528_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507529/SRR5507529.sra    SRR5507529_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507530/SRR5507530.sra    SRR5507530_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507531/SRR5507531.sra    SRR5507531_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507532/SRR5507532.sra    SRR5507532_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507533/SRR5507533.sra    SRR5507533_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507534/SRR5507534.sra    SRR5507534_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507535/SRR5507535.sra    SRR5507535_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507536/SRR5507536.sra    SRR5507536_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507537/SRR5507537.sra    SRR5507537_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507538/SRR5507538.sra    SRR5507538_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507539/SRR5507539.sra    SRR5507539_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507540/SRR5507540.sra    SRR5507540_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507541/SRR5507541.sra    SRR5507541_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507542/SRR5507542.sra    SRR5507542_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507543/SRR5507543.sra    SRR5507543_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507544/SRR5507544.sra    SRR5507544_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507545/SRR5507545.sra    SRR5507545_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507546/SRR5507546.sra    SRR5507546_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507547/SRR5507547.sra    SRR5507547_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507548/SRR5507548.sra    SRR5507548_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR5507549/SRR5507549.sra    SRR5507549_Diploid_27.sra
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR550/SRR55075/SRR55075.sra    SRR55075_Diploid_27.sra

So in essence, SRA-explorer only retrieves the very first entry when a sample has more than one. I am sure it is just a one-line fix for someone with your JavaScript capabilities :)

Cheers, Felix
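
For reference, the expected output (one line per run) can be sketched as follows. This is a minimal illustration that only reproduces the URL and file name pattern visible in the lists above; the function names `sraUrl` and `downloadList` and the tab separator are assumptions, not part of SRA-explorer or Labrador.

```javascript
// Build the FTP URL for one run accession, following the pattern seen in
// the download lists above: .../sra/<first 3 chars>/<first 6 chars>/<acc>/<acc>.sra
// (hypothetical helper, not taken from either tool)
function sraUrl(acc) {
  const prefix3 = acc.slice(0, 3); // e.g. "SRR"
  const prefix6 = acc.slice(0, 6); // e.g. "SRR550"
  return (
    "ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/" +
    `${prefix3}/${prefix6}/${acc}/${acc}.sra`
  );
}

// Emit one "<url><TAB><file name>" line for EVERY run of a sample,
// not only the first one. The tab separator is an assumption.
function downloadList(runAccessions, sampleName) {
  return runAccessions
    .map((acc) => `${sraUrl(acc)}\t${acc}_${sampleName}.sra`)
    .join("\n");
}

console.log(downloadList(["SRR5507528", "SRR5507529"], "Diploid_27"));
```

Calling `downloadList` with all run accessions of a sample yields one line per run, which is the behaviour the Labrador list shows.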

ewels commented 5 years ago

I can't find Diploid_27 in the list of returned samples now... Can you confirm? Thanks!

FelixKrueger commented 5 years ago

Hi Phil,

I can confirm that SRA-explorer no longer finds Diploid_27 at all.

But the sample is still present at NCBI (https://www.ncbi.nlm.nih.gov/sra/SRP098829), and Labrador still finds it:

GSM2598386 SRR5507528 SRR5507529 SRR5507530 SRR5507531 SRR5507532 SRR5507533 SRR5507534 SRR5507535 SRR5507536 SRR5507537 SRR5507538 SRR5507539 SRR5507540 SRR5507541 SRR5507542 SRR5507543 SRR5507544 SRR5507545 SRR5507546 SRR5507547 SRR5507548 SRR5507549 SRR55075 | 2019/03/02

ewels commented 5 years ago

OK, I figured it out. This was a relatively bad bug and I should have fixed it ages ago. Sorry!

Basically, if an SRA record had more than one SRR run attached, it broke. This should be fixed now, and sure enough, all of these runs appear. I think it was happening to a lot of other records too.

Let me know if you spot any other problems!
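
The actual fix is not shown in this thread. As a rough sketch of this class of bug, assume a hypothetical record whose `runs` field is a single object when there is one run and an array when there are several; code written for the single-object case then misses every run after the first. The record shape and the name `runAccessions` are illustrative assumptions, not the real SRA-explorer data model.

```javascript
// Return every run accession of a record, whether `runs` is missing,
// a single run object, or an array of run objects.
// (Hypothetical record shape; NOT the actual SRA-explorer code.)
function runAccessions(record) {
  const runs = record.runs;
  if (runs === undefined || runs === null) {
    return [];
  }
  // Normalise to an array so one run and many runs share the same path.
  const list = Array.isArray(runs) ? runs : [runs];
  return list.map((run) => run.acc);
}

// One run stored as a bare object.
const single = { sample: "Diploid_27", runs: { acc: "SRR5507528" } };

// Several runs stored as an array.
const multi = {
  sample: "Diploid_27",
  runs: [{ acc: "SRR5507528" }, { acc: "SRR5507529" }, { acc: "SRR5507530" }],
};

console.log(runAccessions(single));
console.log(runAccessions(multi));
```

Normalising to an array up front keeps the rest of the code free of special cases for single-run records.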

FelixKrueger commented 5 years ago

Excellent, better late than never!