EI-CoreBioinformatics / mikado

Mikado is a lightweight Python3 pipeline whose purpose is to facilitate the identification of expressed loci from RNA-Seq data and to select the best models in each locus.
https://mikado.readthedocs.io/en/stable/
GNU Lesser General Public License v3.0

Error running serialise #442

Open johannesnicolaus opened 1 year ago

johannesnicolaus commented 1 year ago

Hello. I'm getting the following error, but I'm not sure how to debug it:

Mikado crashed, cause:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/Mikado/__main__.py", line 68, in main
    args.func(args)
  File "/usr/local/lib/python3.9/site-packages/Mikado/subprograms/serialise.py", line 383, in serialise
    load_orfs(mikado_configuration, logger)
  File "/usr/local/lib/python3.9/site-packages/Mikado/subprograms/serialise.py", line 180, in load_orfs
    serializer = orf.OrfSerializer(orf_file,
  File "/usr/local/lib/python3.9/site-packages/Mikado/serializers/orf.py", line 220, in __init__
    assert os.path.exists(fasta_index)
AssertionError

The final line of the log:

2023-06-06 16:22:52,847 - serialise - serialise.py:176 - INFO - load_orfs - MainProcess - Starting to load ORF data

Is it possible that something is wrong with the ORF data? I am using TransDecoder.LongOrfs to generate the ORF GFF3 file.

isaacvock commented 1 year ago

I ran into the same problem and am also using TransDecoder, but I don't believe it is a TransDecoder issue. I was able to solve the problem by explicitly specifying the path to the FASTA file of transcripts produced by mikado prepare (i.e., by passing the path to mikado serialise via the --transcripts argument). The tutorial suggested that this wasn't necessary given a configuration file produced by mikado, but perhaps I am misinterpreting it.

This is the relevant code from orf.py that seems to be the cause of the error:

fasta_index = configuration.serialise.files.transcripts

...  

if isinstance(fasta_index, str):
    assert os.path.exists(fasta_index)
    fasta_index = pysam.FastaFile(fasta_index)

I have removed some of the code that seems irrelevant to the error. The AssertionError we got suggests that the transcripts path passed from the configuration file to this function does not point to an existing file. This is why I think explicitly specifying the path solved the issue for me. Why it would be misspecified in the configuration file, I don't know.
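
In case it helps with debugging, here is a rough sketch of how the same existence check could be run by hand before calling mikado serialise, so that a bad path gives a readable message instead of a bare AssertionError. The helper name check_transcripts_path is hypothetical and not part of Mikado; the only assumption is that the configured transcripts value is a plain string path, as the isinstance check in the excerpt above implies.

```python
import os


def check_transcripts_path(path):
    """Hypothetical helper (not part of Mikado): validate a transcripts path.

    Mirrors the os.path.exists() check from the orf.py excerpt, but raises
    an error that names the offending path and the working directory.
    """
    # The excerpt only checks string values, so anything else is passed through.
    if not isinstance(path, str):
        return path

    # Relative paths are resolved against the current working directory,
    # which is a common reason for a configured path not being found.
    resolved = os.path.abspath(path)
    if not os.path.exists(resolved):
        raise FileNotFoundError(
            f"Transcripts FASTA not found: {path!r} "
            f"(resolved to {resolved!r} from {os.getcwd()!r})"
        )
    return resolved
```

Calling it with the transcripts value from your own configuration file, from the directory where you run mikado serialise, should show whether the path is simply relative to a different working directory.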