ncbi / fcs

Foreign Contamination Screening caller scripts and documentation

[BUG]: `clean genome` introduces empty lines #43

Closed (schellt closed this issue 7 months ago)

schellt commented 1 year ago

Describe the bug

When running `fcs.py clean genome`, empty lines (extra newlines) are introduced in the FASTA output files after the end of every sequence. These lines were not present in the input. This causes problems in some tools used for downstream analyses.

To Reproduce

fcs.py --image=$SIF clean genome \
    -i assembly.fa \
    --action-report assembly.fcs_gx_report.txt \
    --contam-fasta-out assembly.CONTAM.fa \
    --min-seq-len 0 \
    -o assembly.filter.fa

Software versions (please complete the following information):

Log Files

Let me know if you need those.

Additional context

None.

etvedte commented 1 year ago

Hello,

We've been able to reproduce this issue. For now, you can run the following command on your output FASTA as a workaround:

sed -i '/^$/d' file.fa
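The sed command above edits the file in place. As a rough sketch of a more cautious variant, the helpers below first count the blank lines and then write a cleaned copy, leaving the original untouched. The function names `count_blank_lines` and `strip_blank_lines` are hypothetical and not part of FCS; the file names in the usage comment are only examples taken from the report above.

```shell
# Hypothetical helper (not part of FCS): print the number of empty lines in a file.
count_blank_lines() {
    # grep -c prints the match count; it exits non-zero when the count is 0,
    # so '|| true' keeps this usable in scripts that run under 'set -e'.
    grep -c '^$' "$1" || true
}

# Hypothetical helper (not part of FCS): write a copy of a FASTA file
# with all empty lines removed. The input file is not modified.
strip_blank_lines() {
    sed '/^$/d' "$1" > "$2"
}

# Example usage (file names are assumptions):
#   count_blank_lines assembly.filter.fa
#   strip_blank_lines assembly.filter.fa assembly.filter.noblank.fa
```

Writing to a separate file makes it possible to compare the two versions before replacing the original. This only removes completely empty lines, which matches the pattern used in the workaround.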

We will address this issue in an upcoming FCS-GX release, but do not currently have a timetable for that.

Eric

etvedte commented 7 months ago

This should be fixed in the new FCS release v0.5.0. Please re-open if you are still seeing the same problem.