ncbi / sra-tools

SRA Tools

make_joined_filename() with fasterq-dump #917

Closed rchikhi closed 3 months ago

rchikhi commented 3 months ago

This accession seems unreadable:

SRR6279421

$ s5cmd cp s3://sra-pub-run-odp/sra/SRR6279421/SRR6279421 SRR6279421.sra
$ fasterq-dump --fasta-unsorted --stdout SRR6279421.sra |tail
2024-03-20T13:21:11 fasterq-dump.3.1.0 err: temp_dir.c make_joined_filename() -> RC(rcVDB,rcNoTarg,rcConstructing,rcParam,rcInvalid)
spots read      : 0
reads read      : 0
reads written   : 0

Tested on a March 10 2024 build of sra-tools

wraetz commented 3 months ago

In your case the "--stdout" option is ignored, because the default split mode of fasterq-dump is 'split-3'. This mode creates multiple output files, so --stdout is ignored, and the tool then cannot write to those files. We will improve the tool in the future so that it no longer silently ignores options, but instead reports an error and spells out the reason.

rchikhi commented 3 months ago

Oh! But this appears unrelated to --stdout:

$ fasterq-dump --fasta-unsorted  SRR6279421.sra
2024-03-20T14:34:58 fasterq-dump.3.1.0 err: temp_dir.c make_joined_filename() -> RC(rcVDB,rcNoTarg,rcConstructing,rcParam,rcInvalid)
spots read      : 0
reads read      : 0
reads written   : 0

However, removing --fasta-unsorted does make it output a FASTQ. This is a PacBio accession, so there is only one output file.

wraetz commented 3 months ago

I have to investigate this one...

wraetz commented 3 months ago

There seems to be a problem combining a PacBio accession loaded via HDF5 (pacbio-load) with the --fasta-unsorted mode. Fixing this will take a little longer... (in this case, just use --fasta instead of --fasta-unsorted)

rchikhi commented 3 months ago

Thanks! Good to know that --fasta works around it for now.
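
For anyone scripting around this, the workaround discussed in this thread could be sketched as a small shell wrapper: try --fasta-unsorted first, and rerun with --fasta if the make_joined_filename() error shows up on stderr. This is only a sketch under assumptions: dump_fasta and the FASTERQ_DUMP variable are hypothetical names, not part of sra-tools, and whether fasterq-dump exits non-zero on this error is not shown above, so the wrapper checks both the exit status and the error text.

```shell
# Hypothetical helper (not part of sra-tools): run fasterq-dump in
# --fasta-unsorted mode and fall back to --fasta when the
# make_joined_filename() error from this thread appears.
# FASTERQ_DUMP is an assumed override so the tool path can be swapped.
dump_fasta() {
    local acc="$1"
    local tool="${FASTERQ_DUMP:-fasterq-dump}"
    local log
    local needs_fallback=0
    local status=0

    log="$(mktemp)"

    # First attempt: capture stderr so it can be inspected afterwards.
    if ! "$tool" --fasta-unsorted "$acc" 2> "$log"; then
        needs_fallback=1
    fi

    # Pass the captured messages through to the caller's stderr.
    cat "$log" >&2

    # The exit status on this error is not documented in the thread,
    # so also look for the error text itself.
    if grep -q 'make_joined_filename' "$log"; then
        needs_fallback=1
    fi
    rm -f "$log"

    if [ "$needs_fallback" -eq 1 ]; then
        echo "dump_fasta: retrying $acc with --fasta" >&2
        "$tool" --fasta "$acc" || status=$?
    fi
    return "$status"
}
```

With the file from the original report, this would be called as `dump_fasta SRR6279421.sra`. Note that --fasta, unlike --fasta-unsorted, goes through the tool's regular sorted path, so it may need more time and temporary disk space.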