Open SeereouslyDrewNichols opened 4 months ago
Thanks for the feedback. I cannot fix 1.8.1, but this should not be a problem in 1.9.1, since it writes libraries in .parquet format. I will also check whether there is still an issue with the trailing line break.
Best, Vadim
Any chance of releasing the 1.8.1 code base? I can patch it myself.
No, it's intended to be closed source.
There is an issue with DIA-NN when using the --gen-spec-lib flag to generate a library when the gene ID is at the end of a FASTA description. Concretely, if the description for an entry in the FASTA file has GN=XXXX as its last field, DIA-NN picks up the line break along with GN=XXXX and includes both in the generated library. The result is a malformed TSV file with a line break in the Genes column (see attached picture). Would it be possible to patch v1.8.1 and above with this fix?
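
For anyone stuck on 1.8.1 in the meantime, a possible workaround is to clean the FASTA before passing it to DIA-NN. The sketch below is only a guess at the cause: it assumes the stray line break comes from Windows-style (`\r\n`) line endings or trailing whitespace on the header line, which has not been confirmed in this thread. The function names and the example header (including the accession) are hypothetical and only for illustration.

```python
import re
from typing import Optional

# Matches a UniProt-style "GN=<gene>" field; \S+ stops at any whitespace,
# so a trailing "\r" or "\n" is never captured as part of the gene name.
GENE_RE = re.compile(r"\bGN=(\S+)")


def gene_from_header(header: str) -> Optional[str]:
    """Return the GN= value from a FASTA header line, or None if absent."""
    match = GENE_RE.search(header)
    return match.group(1) if match else None


def normalize_fasta_text(text: str) -> str:
    """Convert all line endings to '\n' and strip trailing whitespace
    from header lines (those starting with '>')."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    cleaned = [line.rstrip() if line.startswith(">") else line for line in lines]
    return "\n".join(cleaned)


if __name__ == "__main__":
    # Hypothetical header where GN=XXXX is the last field, with a CRLF ending.
    raw = ">sp|P12345|TEST_HUMAN Test protein OS=Homo sapiens OX=9606 GN=XXXX\r\nMKV\r\n"
    print(repr(normalize_fasta_text(raw)))
    print(gene_from_header(raw))
```

Writing the output of `normalize_fasta_text` to a new file and pointing DIA-NN at that file may avoid the line break in the Genes column, assuming line endings really are the trigger.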