jason-weirather / AlignQC

Long read alignment analysis. Generates reports on sequence alignments for mappability vs read sizes, error patterns, annotations and rarefaction curve analysis. The most basic analysis only requires a BAM file, and outputs a web browser compatible xhtml to visualize/share/store/extract analysis results.
Apache License 2.0

Error while reading reference fasta #11

Closed: weedcentipede closed 6 years ago

weedcentipede commented 6 years ago

Hi, while running AlignQC with a reference FASTA supplied, I got the error pasted below. (Without the reference, it runs perfectly.)

The following is the command line: ~/AlignQC-1.2/bin/alignqc analyze MLDT_finally.bam -r MLDT-gDNA.fasta --no_annotation -o long_reads.alignqc.xhtml

The following is the error:

Reading reference fasta
Reading index
Traceback (most recent call last):
  File "/home/bio/AlignQC-1.2/bin/alignqc", line 44, in <module>
    main()
  File "/home/bio/AlignQC-1.2/bin/alignqc", line 24, in main
    analyze.external_cmd(" ".join(operable_argv),version=version)
  File "/home/bio/AlignQC-1.2/utilities/analyze.py", line 77, in external_cmd
    main(args)
  File "/home/bio/AlignQC-1.2/utilities/analyze.py", line 44, in main
    prepare_all_data.external(args)
  File "/home/bio/AlignQC-1.2/utilities/prepare_all_data.py", line 764, in external
    main(args)
  File "/home/bio/AlignQC-1.2/utilities/prepare_all_data.py", line 72, in main
    make_data_bam_reference(args)
  File "/home/bio/AlignQC-1.2/utilities/prepare_all_data.py", line 383, in make_data_bam_reference
    bam_to_context_error_plot.external_cmd(cmd)
  File "/home/bio/AlignQC-1.2/utilities/bam_to_context_error_plot.py", line 147, in external_cmd
    main(args)
  File "/home/bio/AlignQC-1.2/utilities/bam_to_context_error_plot.py", line 47, in main
    epf.add_alignment(e)
  File "/home/bio/AlignQC-1.2/pylib/Bio/Errors.py", line 55, in add_alignment
    ae = AlignmentErrors(align)
  File "/home/bio/AlignQC-1.2/pylib/Bio/Errors.py", line 499, in __init__
    self._context_target_errors = self.get_context_target_errors()
  File "/home/bio/AlignQC-1.2/pylib/Bio/Errors.py", line 574, in get_context_target_errors
    r[t][tafter]['-']['total'] += 0.5
KeyError: '\r'

Would you please help me? Thanks, Luis Alfonso

jason-weirather commented 6 years ago

Hi Luis,

It looks like the program is hitting an error on a "carriage return" ('\r'). This is an ASCII character that a few programs, such as Windows Notepad, insert into text documents at the end of each line. My guess is that your reference fasta has these carriage returns in it. You can strip them out like this:

$ tr -d '\r' < MLDT-gDNA.fasta > MLDT-gDNA-fixed.fasta

I'd give that a try, then rerun with the fixed file and see if it helps. Also, it looks like you are using an older version of AlignQC. There are instructions on the GitHub page on how to get the newer version, if that would also be helpful. Note that the command in the new version has changed slightly: -t sets the reference transcriptome, and -g sets the reference genome (and GTF is now accepted by default in the new version).
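For anyone without `tr` available, the same cleanup can be sketched in Python. This is only an illustration of the idea, not part of AlignQC: the function name `strip_carriage_returns` and the chunk size are made up for this example. It copies the file in binary mode, drops every '\r' byte, and reports how many were removed, so a count of zero suggests that carriage returns were not the problem.

```python
import io


def strip_carriage_returns(src_path, dst_path, chunk_size=1 << 20):
    # Hypothetical helper (not an AlignQC function): copy src_path to
    # dst_path without any carriage-return bytes, like `tr -d '\r'`.
    removed = 0
    with io.open(src_path, "rb") as src, io.open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            # Count before replacing so the caller can see whether the
            # file actually contained Windows-style line endings.
            removed += chunk.count(b"\r")
            dst.write(chunk.replace(b"\r", b""))
    return removed
```

For example, `strip_carriage_returns("MLDT-gDNA.fasta", "MLDT-gDNA-fixed.fasta")` would write the cleaned copy under the same output name used in the `tr` command above.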

I hope that helps!

weedcentipede commented 6 years ago

As a follow-up comment on this issue, this is what happens when you pass the alignment file as your genome:

Traceback (most recent call last):
  File "/home/bio/anaconda3/envs/py27/bin/alignqc", line 11, in <module>
    sys.exit(entry_point())
  File "/home/bio/anaconda3/envs/py27/lib/python2.7/site-packages/alignqc/alignqc.py", line 47, in entry_point
    main(args,operable_argv)
  File "/home/bio/anaconda3/envs/py27/lib/python2.7/site-packages/alignqc/alignqc.py", line 17, in main
    analyze.external_cmd(operable_argv,version=version)
  File "/home/bio/anaconda3/envs/py27/lib/python2.7/site-packages/alignqc/analyze.py", line 88, in external_cmd
    main(args)
  File "/home/bio/anaconda3/envs/py27/lib/python2.7/site-packages/alignqc/analyze.py", line 54, in main
    prepare_all_data.external(args)
  File "/home/bio/anaconda3/envs/py27/lib/python2.7/site-packages/alignqc/prepare_all_data.py", line 844, in external
    main(args)
  File "/home/bio/anaconda3/envs/py27/lib/python2.7/site-packages/alignqc/prepare_all_data.py", line 65, in main
    make_data_bam_reference(args)
  File "/home/bio/anaconda3/envs/py27/lib/python2.7/site-packages/alignqc/prepare_all_data.py", line 404, in make_data_bam_reference
    bam_to_context_error_plot.external_cmd(cmd)
  File "/home/bio/anaconda3/envs/py27/lib/python2.7/site-packages/alignqc/bam_to_context_error_plot.py", line 146, in external_cmd
    main(args)
  File "/home/bio/anaconda3/envs/py27/lib/python2.7/site-packages/alignqc/bam_to_context_error_plot.py", line 21, in main
    ref = FASTAData(open(args.reference).read())
  File "/home/bio/anaconda3/envs/py27/lib/python2.7/site-packages/seqtools/format/fasta/__init__.py", line 126, in __init__
    self._scan_data(data)
  File "/home/bio/anaconda3/envs/py27/lib/python2.7/site-packages/seqtools/format/fasta/__init__.py", line 185, in _scan_data
    f = FASTA(m.group(0))
  File "/home/bio/anaconda3/envs/py27/lib/python2.7/site-packages/seqtools/format/fasta/__init__.py", line 74, in __init__
    name = re.match('(\S+)',header).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
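
The last frame shows `re.match('(\S+)',header)` returning None, meaning a header had no name at its start, which is what one would expect if the file given as the reference is not a plain-text FASTA. A quick pre-check before running AlignQC can be sketched as follows. This is a rough sketch, not AlignQC or seqtools code: `check_fasta_headers` and its messages are hypothetical, and it assumes that the header seen by the parser is the text following '>'. It relies on the fact that BAM files are BGZF (gzip-compatible) compressed and therefore start with the bytes 0x1f 0x8b.

```python
import re


def check_fasta_headers(path, max_records=1000):
    # Hypothetical pre-check (not part of AlignQC or seqtools).
    # Returns a list of human-readable problems; empty means none found.
    problems = []
    with open(path, "rb") as handle:
        first = handle.read(2)
        if first == b"\x1f\x8b":
            # gzip/BGZF magic number: a BAM or gzipped file, not plain FASTA.
            return ["file is gzip/BGZF compressed (a BAM starts this way)"]
        if not first.startswith(b">"):
            problems.append("file does not start with '>'")
        handle.seek(0)
        seen = 0
        for lineno, line in enumerate(handle, 1):
            if not line.startswith(b">"):
                continue
            seen += 1
            # Same pattern as in the traceback: the name must be the
            # first run of non-whitespace characters after '>'.
            if re.match(br"(\S+)", line[1:]) is None:
                problems.append("line %d: header has no name after '>'" % lineno)
            if seen >= max_records:
                break
    return problems
```

An empty list from `check_fasta_headers("MLDT-gDNA.fasta")` would suggest that the file at least looks like a FASTA with named records; any message it returns points at a line worth inspecting by hand.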