smithlabcode / methpipe

A pipeline for analyzing DNA methylation data from bisulfite sequencing.
http://smithlabresearch.org/methpipe

incompatible behavior between abismal and methpipe with chrom names #183

Closed guilhermesena1 closed 3 years ago

guilhermesena1 commented 3 years ago

When abismal maps reads to a FASTA reference whose chromosome names contain spaces, it keeps only the first word of each name as the RNAME field in its SAM output. If that SAM file is then passed to methpipe programs (e.g. methcounts or bsrate) together with the same reference, the chromosome names in the SAM file, which contain only the first word, are not found in the reference.

Suggested fix: Change read_fasta_file in smithlab_cpp so that only the first word of each chromosome name is used internally. Since both bsrate and methcounts use this function to read the reference, whether it is given as a directory or as a file, this should make their behavior consistent with abismal.

andrewdavidsmith commented 3 years ago

Fixed, using the read_fasta_file_short_names function instead of changing read_fasta_file.