There are a few commands that include contig names in filenames -- right now it's just phasing commands:
- `strainFlye smooth create` (output reads for each contig are named `[contig].fasta.gz`)
- `strainFlye smooth assemble` (output LJA assemblies for each contig are written to a folder named `[contig]`)
- `strainFlye link nt` (output pickle files are named `[contig]_pos2nt2ct.pickle` and `[contig]_pospair2ntpair2ct.pickle`)
- `strainFlye link graph` (output graphs, regardless of format, include `[contig]` as a prefix)
In most cases, contig names will be restricted to `[a-zA-Z0-9_.-]`, and should thus be fine as filenames. But I'm sure eventually we'll start seeing weird contig names with spaces, slashes, or other characters that will mess this up.
I'm not sure it's worth trying to anticipate and address these problems in advance (we could modify the FASTA-loading parts of the code to do some validation on contig names), but I'm making this issue just to catalog what parts of the code this problem touches at the moment.
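For reference, here's a rough sketch of what that kind of contig-name validation could look like. This isn't existing strainFlye code: the function names (`is_filename_safe`, `find_unsafe_contig_names`) are hypothetical, and the "safe" alphabet is just the character set mentioned above, which is an assumption rather than a settled rule.

```python
import re

# Assumed "safe" alphabet: letters, digits, underscore, period, hyphen.
# (The hyphen goes last so the regex treats it as a literal character.)
SAFE_CONTIG_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def is_filename_safe(contig_name):
    """Return True if contig_name only uses the assumed safe characters."""
    # "." and ".." match the pattern but are special paths, so reject them.
    if contig_name in (".", ".."):
        return False
    return SAFE_CONTIG_NAME.fullmatch(contig_name) is not None


def find_unsafe_contig_names(contig_names):
    """Return the names (in input order) that would be risky as filenames."""
    return [name for name in contig_names if not is_filename_safe(name)]
```

Something like `find_unsafe_contig_names()` could be called once on the contig names right after a FASTA file is loaded, and the command could then raise an error (or warn) listing the offending names before any output files get written.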