I am trying to run the transcriptome workflow of REAT, but unfortunately I am getting stuck at the serialise step. I already had a look at the Mikado issues page, but couldn't find a solution there.
I installed REAT in a conda environment using
mamba env create -f reat/reat.yml
In reat.yml I specified python=3.8 and pip=21.3.1, and in setup.py I changed the Mikado dependency to the fix_install branch. REAT itself was installed with
pip install ./reat --no-cache-dir
The inputs for mikado serialise were all created, but no mikado.db was made.
This is the error that serialise.log reports:
1a.hisat.scallop_1a_SCLP.18757.5.0 0 2365 ID=1a.hisat.scallop_1a_SCLP.18757.5.0;coding=False 4.1 - 2180 2332 0 1 183 2181
2023-10-24 12:07:41,200 - Bed12ParseWrapper-19 - bed12.py:1871 - WARNING - run - Bed12ParseWrapper-19 - Invalid entry, reason: Invalid CDS length: 152 % 3 = 2 (1734-1885, 0)
1a.hisat.scallop_1a_SCLP.16680.10.0 0 1894 ID=1a.hisat.scallop_1a_SCLP.16680.10.0;coding=False 3.2 - 1733 1885 0 1 159 1734
2023-10-24 12:07:41,976 - serialise - orf.py:448 - INFO - __serialize_multiple_threads - MainProcess - Finished loading 57955 ORFs into the database
2023-10-24 12:07:42,358 - serialise - serialise.py:187 - INFO - load_orfs - MainProcess - Finished loading ORF data
2023-10-24 12:07:42,370 - serialise - serialise.py:142 - INFO - load_blast - MainProcess - Starting to load BLAST data
2023-10-24 12:07:42,371 - serialise - blast_serialiser.py:82 - INFO - __init__ - MainProcess - Number of dedicated workers: 40
2023-10-24 12:07:44,024 - serialise - blast_serialiser.py:249 - INFO - __serialize_targets - MainProcess - Started to serialise the targets
2023-10-24 12:07:45,101 - serialise - blast_serialiser.py:283 - INFO - __serialize_targets - MainProcess - Loaded 377931 objects into the "target" table
2023-10-24 12:07:45,124 - serialise - blast_serialiser.py:174 - INFO - __serialize_queries - MainProcess - Started to serialise the queries
2023-10-24 12:07:45,148 - serialise - blast_serialiser.py:226 - INFO - __serialize_queries - MainProcess - Loaded 0 objects into the "query" table
2023-10-24 12:07:45,151 - serialise - tab_serialiser.py:31 - INFO - _serialise_tabular - MainProcess - Creating a pool with 40 workers for analysing BLAST results
2023-10-24 12:07:46,058 - serialise - tabular_utils.py:431 - INFO - parse_tab_blast - MainProcess - Reading /data/elisa/spirogyra_genome/annotation/transcriptome_workflow/cromwell-executions/ei_annotation/ee7f86bc-60c8-42a6-8c0c-23b7dce589be/call-wf_main_mikado/wf_main_mikado/330e702b-9b00-4783-b796-f3aaf179c544/call-Mikado_short_and_long/wf_mikado/eb03adb7-86bf-45d0-bb05-054137e14b77/call-MikadoSerialise/inputs/746107790/mikado_diamond_homology.tsv data
2023-10-24 12:07:48,322 - serialise - serialise.py:388 - ERROR - serialise - MainProcess - Mikado crashed due to an error. Please check the logs for hints on the cause of the error; if it is a bug, please report it to https://github.com/EI-CoreBioinformatics/mikado/issues.
2023-10-24 12:07:48,322 - serialise - serialise.py:390 - ERROR - serialise - MainProcess - Cannot use a compiled regex as replacement pattern with regex=False
The workflow runs fine on the same machine with the same data using a version of REAT that was installed about a year ago (it is not mine and carries some in-house patches, so I cannot simply reuse it), so the input files should be fine.
I managed to fix my installation by downgrading numpy, sqlalchemy and pandas:
pip install numpy==1.23.0 --no-cache-dir
pip install sqlalchemy==1.4.38 --no-cache-dir
pip install pandas==1.4.3 --no-cache-dir
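In case it helps anyone who hits the same error: I believe the last log line comes from pandas' Series.str.replace. pandas 2.x changed the default to regex=False, and passing a pre-compiled pattern then raises exactly this ValueError, whereas pandas 1.4.x still defaulted to regex=True; that would explain why pinning pandas==1.4.3 makes the serialise step work again. A minimal sketch of the behaviour (hypothetical pattern and data, not mikado's actual code):

import re
import pandas as pd

# Hypothetical pattern and data, only to illustrate the pandas behaviour.
pattern = re.compile(r"\.\d+$")
queries = pd.Series(["transcript.1", "transcript.2"])

# pandas >= 2.0 defaults to regex=False and rejects a compiled pattern with
#   ValueError: Cannot use a compiled regex as replacement pattern with regex=False
# pandas 1.4.x still defaulted to regex=True, so the same call works there.
try:
    print(queries.str.replace(pattern, ""))
except ValueError as err:
    print(err)

# Passing regex=True explicitly works on both versions.
print(queries.str.replace(pattern, "", regex=True))

(The numpy and sqlalchemy pins presumably address similar compatibility issues; I have only traced the pandas one.)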
For reference, here is the list of programs in my conda environment: REAT_conda_env.txt, and the general REAT log file: log.out.txt