kfuku52 / amalgkit

RNA-seq data amalgamation for a large-scale evolutionary transcriptomics
BSD 3-Clause "New" or "Revised" License

fastq-dump.2.8.0 err: column not found while opening table within short read archive module #131

Closed. docxology closed this issue 1 year ago

docxology commented 1 year ago

fastq-dump-2.8.0_error.txt

Full log is attached, and below is just the part with the error.

I now have the pipeline working well up to this point with 4000+ SRAs, but this error is blocking dozens to hundreds of them (quant and curate continue fine when I remove a failing SRA like the one below). Thank you!


$amalgkit getfastq --id DRR129208
AMALGKIT version: 0.9.16
AMALGKIT command: /home/tet/miniconda3/bin/amalgkit getfastq --id DRR129208
AMALGKIT bug report: https://github.com/kfuku52/amalgkit/issues
amalgkit getfastq: start
pigz found. It will be used for compression/decompression in read name formatting.
--id is specified. Downloading SRA metadata from Entrez.
Entrez search term: DRR129208
Number of SRA records: 1
processing SRA records: 0 - 1
2023-06-16 10:13:18: Converting 0th sample from XML to DataFrame
2023-06-16 10:13:18: Finished converting 1 samples
Filtering SRA entry with --layout: auto
Individual SRA size of DRR129208: 39,143,269.0 bp
Number of SRAs to be processed: 1
Total target size (--max_bp): 999,999,999,999,999 bp
The sum of SRA sizes: 39,143,269.0 bp
Target size per SRA: 999,999,999,999,999 bp

Processing SRA ID: DRR129208
Library layout: single
Number of reads: 772,651
Single/Paired read length: 51 bp
Total bases: 39,143,269 bp
Processing DRR129208 as publicly available data from SRA.
Previously-downloaded sra file was detected at: /media/tet/56D80A6E7A225267/Transcriptome/getfastq/DRR129208/DRR129208.sra
Total sampled bases: 39,405,201 bp
Command: parallel-fastq-dump -t 1 --minReadLen 25 --qual-filter-1 --skip-technical --split-3 --clip --gzip --outdir /media/tet/56D80A6E7A225267/Transcriptome/getfastq/DRR129208 --tmpdir /media/tet/56D80A6E7A225267/Transcriptome/getfastq/DRR129208 --minSpotId 1 --maxSpotId 772651 -s /media/tet/56D80A6E7A225267/Transcriptome/getfastq/DRR129208/DRR129208.sra
parallel-fastq-dump stdout:

parallel-fastq-dump stderr:
2023-06-16 10:13:18,299 - SRR ids: ['/media/tet/56D80A6E7A225267/Transcriptome/getfastq/DRR129208/DRR129208.sra']
2023-06-16 10:13:18,299 - extra args: ['--minReadLen', '25', '--qual-filter-1', '--skip-technical', '--split-3', '--clip', '--gzip']
2023-06-16 10:13:18,299 - tempdir: /media/tet/56D80A6E7A225267/Transcriptome/getfastq/DRR129208/pfd_iffua82y
2023-06-16 10:13:18,299 - CMD: sra-stat --meta --quick /media/tet/56D80A6E7A225267/Transcriptome/getfastq/DRR129208/DRR129208.sra
2023-06-16 10:13:18,311 - /media/tet/56D80A6E7A225267/Transcriptome/getfastq/DRR129208/DRR129208.sra spots: 772651
2023-06-16 10:13:18,311 - blocks: [[1, 772651]]
2023-06-16 10:13:18,312 - CMD: fastq-dump -N 1 -X 772651 -O /media/tet/56D80A6E7A225267/Transcriptome/getfastq/DRR129208/pfd_iffua82y/0 --minReadLen 25 --qual-filter-1 --skip-technical --split-3 --clip --gzip /media/tet/56D80A6E7A225267/Transcriptome/getfastq/DRR129208/DRR129208.sra
2023-06-16T17:13:18 fastq-dump.2.8.0 err: column not found while opening table within short read archive module - failed /media/tet/56D80A6E7A225267/Transcriptome/getfastq/DRR129208/DRR129208.sra

=============================================================
An error occurred during processing.
A report was generated into the file '/home/tet/ncbi_error_report.xml'.
If the problem persists, you may consider sending the file to 'sra@ncbi.nlm.nih.gov' for assistance.

2023-06-16 10:13:18,338 - fastq-dump error! exit code: 3

pfd did not finish safely.
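
For anyone who needs to set aside the failing SRAs so the rest of a large batch can continue, here is a minimal sketch of scanning a saved getfastq log for failed IDs. It assumes the log wording matches this excerpt ("Processing SRA ID:", "fastq-dump error! exit code", "pfd did not finish safely."). Other amalgkit or parallel-fastq-dump versions may phrase these lines differently. The function name `find_failed_ids` is hypothetical and not part of amalgkit.

```python
import re

# Patterns copied from the log excerpt in this issue (assumption: other
# amalgkit / parallel-fastq-dump versions may word these lines differently).
ID_PATTERN = re.compile(r"Processing SRA ID: (\S+)")
FAILURE_MARKERS = ("fastq-dump error! exit code", "pfd did not finish safely.")


def find_failed_ids(log_text):
    """Return SRA IDs whose log section contains a failure marker.

    IDs are returned in order of first appearance, without duplicates.
    """
    failed = []
    current_id = None
    for line in log_text.splitlines():
        match = ID_PATTERN.search(line)
        if match:
            # A new "Processing SRA ID:" line starts a new section.
            current_id = match.group(1)
        if current_id and any(marker in line for marker in FAILURE_MARKERS):
            if current_id not in failed:
                failed.append(current_id)
    return failed
```

Calling `find_failed_ids(open("getfastq.log").read())` on a combined log (the file name is only an example) would give a list of IDs to exclude before rerunning quant and curate.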

kfuku52 commented 1 year ago

This is a fastq-dump error. Please try the latest version. Last month, I used amalgkit with fastq-dump v3.0.5, and it worked well for 2k+ samples.

docxology commented 1 year ago

Thank you, Kenji. I followed the instructions here to update sra-toolkit/sra-tools to fastq-dump v3.0.5: https://github.com/ncbi/sra-tools/wiki/02.-Installing-SRA-Toolkit Now things are working perfectly. I will keep you posted if anything else comes up.
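
To catch an outdated install before starting a large run, a small pre-flight check along these lines may help. This is only a sketch. It assumes `fastq-dump --version` prints a dotted version such as `2.8.0` somewhere in its output. The threshold `(3, 0, 5)` is simply the version reported as working in this thread, not a documented minimum. The helper names are hypothetical.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Extract the first X.Y.Z version in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_fastq_dump_version(executable="fastq-dump"):
    """Return the installed fastq-dump version, or None if it cannot be found."""
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Assumption: the version string appears in stdout or stderr.
    return parse_version(result.stdout + result.stderr)


def is_older_than(version, minimum=(3, 0, 5)):
    """True if version is known and lower than minimum (tuple comparison)."""
    return version is not None and version < minimum
```

For example, `is_older_than(installed_fastq_dump_version())` returning `True` would suggest updating sra-tools before running `amalgkit getfastq`.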

kfuku52 commented 1 year ago

Great! Thank you for reporting.