ncbi / sra-human-scrubber

An SRA tool that takes as input a local fastq file from a clinical infection sample, identifies and removes any significant human reads, and outputs the edited (cleaned) fastq file that can safely be used for SRA submission.

Can't use stdin and keep spots #24

Closed mbhall88 closed 11 months ago

mbhall88 commented 1 year ago

When piping the fastq file in through stdin (because gzip isn't supported, #4) and also asking to keep removed spots (-r), I get the following error:

Traceback (most recent call last):
  File "/opt/scrubber/scripts/cut_spots_fastq.py", line 85, in <module>
    main()
  File "/opt/scrubber/scripts/cut_spots_fastq.py", line 60, in main
    frs = open(f_removed_spots, "w")
FileNotFoundError: [Errno 2] No such file or directory: '/dev/fd/63.removed_spots'

Which stems from

https://github.com/ncbi/sra-human-scrubber/blob/38163116140e29897e32137c1c53b091f8de694c/scripts/cut_spots_fastq.py#L55

which seems to assume there is an input fastq file passed on the command line.

So I guess either native support for compressed files should be implemented (which I would vote for regardless), or the -r option could take an optional filepath specifying where the removed spots should be saved.
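To make the second suggestion concrete, here is a minimal sketch of how the removed-spots path could be chosen. It assumes, based only on the traceback above, that the script currently appends `.removed_spots` to the input path. The function name `removed_spots_path` and the `user_path` parameter are hypothetical and are not taken from the scrubber's code.

```python
from typing import Optional


def removed_spots_path(input_path: str, user_path: Optional[str] = None) -> str:
    """Decide where removed spots are written (hypothetical helper)."""
    # An explicit path from the user always wins.
    if user_path:
        return user_path

    # Piped input or process substitution shows up as "-", /dev/stdin or
    # /dev/fd/N, none of which is a usable base for an output file name.
    if input_path == "-" or input_path.startswith(("/dev/fd/", "/dev/stdin")):
        raise ValueError(
            "input is a stream; an explicit removed-spots file is required"
        )

    # Assumed current behaviour: derive the name from the input file.
    return input_path + ".removed_spots"
```

With this shape, `removed_spots_path("/dev/fd/63")` fails early with a clear message instead of a `FileNotFoundError` from `open`, while a regular input file keeps the derived name.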

bede commented 12 months ago

Agree with @mbhall88. Since gzip input isn't supported, it's important that piped input is fully supported.

multikengineer commented 11 months ago

Apologies, and thanks for the report. I will fix that and am preparing a minor release in the next week or so.

multikengineer commented 11 months ago

This should be fixed now in 2.2.0. See changelog and readme, and please let me know of any issues.

        -u <user_named_file>; Save identified spots to <user_named_file>.
        NOTE: Required with -r if output is stdout, otherwise optional.