sejmodha closed this issue 4 years ago
Thanks for the bug report @sejmodha. I can confirm it fails for this project id. I am looking into this now.
This is a long shot, but do you know why ENA has multiple files for this SRR:
I thought 1 would be a merged version of 2+3 (which would not make sense, but still) - which is not the case.
> zcat SRR5681734_1.fastq.gz | head
@SRR5681734.1 1/1
CTAGCGGATGAGCTGTGGATAGGGGTGAAAGGCTAAACAAACTTGGAAATAGCTGGTTCTCTCCGAAAACTATTTAGGTAGTGCCTCAAGT
+
GHFGJJJIJJIIHIIHGIFHGIJJJ?FHHIIJJJJJIJJJIIJJIIHHGHHHE@DFDFEEEEEEDDDDDCD@CEEEDDCCDDCCCCDDDDE
@SRR5681734.2 2/1
TGTCCGGGACGATAATGACGGTACCGGAAGAATAAGCCCCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGGGGGCTAGCGTTGCT
+
HBADAGGHIIIIIIIIIIIIIIIIIIIIIIGIIIIIIIHGFDDDDDDCDCD?BDDDCDCDDDBBBB>@BEEDDDDDDDDDDDDDDDDDDDD
@SRR5681734.3 3/1
AATGATGTCGATGCGGGGCAGCAACTCTTCGATGGTCAGGCCCAGCTTGTCTCCGCCGGCCTCTACCGTGTTGAGGAATCCGATCGACAGG
> zcat SRR5681734.fastq.gz | head
@SRR5681734.52646 52646/1
AGAGCTTGCGACGTCGGGCTTGATCCCGGTGGCCGTAATAACGGAGAAACCAATACAGGTTCGAGAGACGATCTGCCCAGGGTAGA
+
FFHHIIGBDHHIEF@@GGIEGEFGEGIIIHFDEECAAB@DCCBBBBBBCCBBBC@C@CCCCCBBCB@@??-8A<((4?CBB53>?@
@SRR5681734.52647 52647/1
CGCGCAGGCTAAAGCGCTTTTTGGGGTGCTTTTTGAGGTGCTCGTAAATCCGTTGTTCTAGCATGATGTCTTCAGAACGAGGCGCTCCTCG
+
FHHFIG?FGC@FG?CGGIIGEDHI>BFBHIIGIF:;?C.6@AA1>?BBECAB?:??5>@3>@:@C:>@B>CCCCCCCC5)9<>B@B9@??B
@SRR5681734.52648 52648/1
CTTACCTCCAGAGCGAAAGCAGCCGCCATCTGACCTCACCCAGCCGCCTCCGCAAATACGCTGCGGAAATTGAATGTATCAAATCCGCCGA
> zcat SRR5681734_2.fastq.gz | head
@SRR5681734.1 1/2
TCTCCCAAGCTGTACTCATCGGTATTCGGAGTTTGCAATGGTTTGGTAAGTCGCCATGACCCCCTAGCCATAACAGTGCTCTACCCCCGAT
+
HHHHJJJIJJJJJBHJJIJJJJGHIIJJJDGIIIIJGEHIIGHIJJHIJJIGGIGGHHEHFFFFDDEDDDDDDDCDDDDDCDCDDDDBDBD
@SRR5681734.2 2/2
GTTGGCCGCCTTCGCCACTGGTGTTCTTGCGAATATCTACGAATTTCACCTCTACACTCGCAGTTCCACCAACCTCTACCAAACTCAAGCC
+
HHHHJJJIJJJJJJJJIJJJGBGIIJJIIIJJJIIJGIJJJJIHHHHGHFFFDFFCE;@ABDDDDDDEDD?BDDDD@CCDCCBDCDDDCDD
@SRR5681734.3 3/2
AACACCATCTCGGCCCAAACGGCCATGAACTCCATCGACATCGATGTCGGGGGGACCTTTACCGATCTCGTGCTGACCCTGGACGGGGAGC
The layout is single-end on both SRA and ENA, yet ENA also has a paired-end version of the FASTQ files.
https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR5681734
https://www.ebi.ac.uk/ena/browser/view/SRR5681734
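For anyone who wants to repeat the comparison above on more than the first few reads, here is a minimal Python sketch. It assumes the three files have already been downloaded locally and that every FASTQ record spans exactly four lines. The helper names `read_ids` and `compare_ids` are made up for this illustration and are not part of any library.

```python
import gzip
from itertools import islice


def read_ids(path, limit=None):
    """Yield read IDs (first header token, without the leading '@')
    from a gzipped FASTQ file.

    Assumes four lines per record, so every fourth line is a header.
    """
    with gzip.open(path, "rt") as handle:
        headers = islice(handle, 0, None, 4)
        for header in islice(headers, limit):
            yield header[1:].split()[0]


def compare_ids(path_a, path_b, limit=None):
    """Count read IDs shared between two files and unique to each."""
    ids_a = set(read_ids(path_a, limit))
    ids_b = set(read_ids(path_b, limit))
    return {
        "shared": len(ids_a & ids_b),
        "only_a": len(ids_a - ids_b),
        "only_b": len(ids_b - ids_a),
    }
```

Calling `compare_ids("SRR5681734.fastq.gz", "SRR5681734_1.fastq.gz")` would show whether the unsuffixed file shares any read IDs with the `_1` file. The `limit` argument caps how many records are read from each file, which keeps memory use down on large runs.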
This has been fixed in master. Thanks once again for reporting. I have also contacted ENA.
https://colab.research.google.com/drive/1THLcuzmW7ESWQbw2hnmb4tHxCGdrJy8n?usp=sharing
Thanks for your help!
Description
I am trying to extract metadata for a number of BioProjects using the Python API. It works fine for most BioProject accessions, but in some cases --detailed=True results in a ValueError.
What I Did
This results in: