EI-CoreBioinformatics / minos

The labyrinth king judges your gene models.
GNU Lesser General Public License v3.0

Unable to build Mikado container #63

Open jacopoM28 opened 2 months ago

jacopoM28 commented 2 months ago

Dear Minos/Mikado developers,

I am trying to use Minos to improve a Braker-derived gene annotation; however, I am having trouble building the Singularity image of Mikado.

(base) [jmartelossi@delta Singularity]$ git clone https://github.com/EI-CoreBioinformatics/mikado
(base) [jmartelossi@delta Singularity]$ cd mikado/Singularity
(base) [jmartelossi@delta Singularity]$ singularity build Singularity.centos.sif Singularity.centos.def
INFO:    User not listed in /etc/subuid, trying root-mapped namespace
INFO:    The %post section will be run under fakeroot
INFO:    Starting build...
Getting image source signatures
Copying blob fab18a2e65e5 skipped: already exists  
Copying blob 8ba884070f61 skipped: already exists  
Copying config a1dfe42ae9 done  
Writing manifest to image destination
Storing signatures
2024/10/01 11:09:04  info unpack layer: sha256:8ba884070f611d31cb2c42eddb691319dc9facf5e0ec67672fcfa135181ab3df
2024/10/01 11:09:06  info unpack layer: sha256:fab18a2e65e55f5d4c10a48e36e0e090345dacd84cf5eb1b6b2526f1a79fe0ad
INFO:    Running post scriptlet
+ export PYTHONDONTWRITEBYTECODE=true
+ PYTHONDONTWRITEBYTECODE=true
+ yum -y install git wget zlib1g-dev gcc gcc-c++
Loaded plugins: fastestmirror, ovl
Determining fastest mirrors
Could not retrieve mirrorlist http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=os&infra=container error was
14: curl#6 - "Could not resolve host: mirrorlist.centos.org; Unknown error"

 One of the configured repositories failed (Unknown),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

     1. Contact the upstream for the repository and get them to fix the problem.

     2. Reconfigure the baseurl/etc. for the repository, to point to a working
        upstream. This is most often useful if you are using a newer
        distribution release than is supported by the repository (and the
        packages for the previous distribution release still work).

     3. Run the command with the repository temporarily disabled
            yum --disablerepo=<repoid> ...

     4. Disable the repository permanently, so yum won't use it by default. Yum
        will then just ignore the repository until you permanently enable it
        again or use --enablerepo for temporary usage:

            yum-config-manager --disable <repoid>
        or
            subscription-manager repos --disable=<repoid>

     5. Configure the failing repository to be skipped, if it is unavailable.
        Note that yum will try to contact the repo. when it runs most commands,
        so will have to try and fail each time (and thus. yum will be be much
        slower). If it is a very temporary problem though, this is often a nice
        compromise:

            yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true

Cannot find a valid baseurl for repo: base/7/x86_64
+ git clone --depth 1 --branch master https://github.com/EI-CoreBioinformatics/mikado.git /usr/local/src/mikado
/.post.script: line 6: git: command not found
FATAL:   While performing build: while running engine: exit status 127

Any advice on what is happening here?

The Linux OS I am working on:

NAME="Rocky Linux"
VERSION="8.9 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.9"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.9 (Green Obsidian)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2029-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-8"
ROCKY_SUPPORT_PRODUCT_VERSION="8.9"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.9"
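
In the build log above, the first failure is the yum step: mirrorlist.centos.org could not be resolved, so git was never installed, and the later git clone failed with exit status 127. One plausible cause, not confirmed in this thread, is that CentOS 7 reached end of life on 30 June 2024 and its mirrorlist service was retired; a DNS problem on the build host would look the same. The sketch below is a hypothetical helper (first_failure and its patterns are illustrative, not part of Singularity or Mikado) that pulls the earliest root-cause line out of a log like this one.

```python
import re

# Illustrative patterns, chosen only to match the messages seen in this thread.
FAILURE_PATTERNS = [
    ("dns", re.compile(r"Could not resolve host: (\S+?);")),
    ("missing_command", re.compile(r"line \d+: (\S+): command not found")),
]


def first_failure(log_text):
    """Return (kind, detail, line_number) for the earliest matching line, or None."""
    for number, line in enumerate(log_text.splitlines(), start=1):
        for kind, pattern in FAILURE_PATTERNS:
            match = pattern.search(line)
            if match:
                return kind, match.group(1), number
    return None


if __name__ == "__main__":
    sample = "\n".join([
        '14: curl#6 - "Could not resolve host: mirrorlist.centos.org; Unknown error"',
        "Cannot find a valid baseurl for repo: base/7/x86_64",
        "/.post.script: line 6: git: command not found",
    ])
    # The DNS failure comes first, so it is reported instead of the git error.
    print(first_failure(sample))
```

Reporting the earliest match matters here because the "git: command not found" line is a consequence of the failed package install, not a separate problem.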

gemygk commented 2 months ago

Hi @jacopoM28

Thanks for reaching out.

Good timing 🙂: this morning I updated the installation instructions for Mikado, covering several methods (Docker, Singularity, Conda, and Mamba). You can follow the Mikado Singularity installation guide to build the image, which should address your needs.

Please let me know if you run into any issues.

Additionally, I will be updating the outdated Singularity definition files in our repositories soon to ensure everything is up-to-date.

Best, Gemy

jacopoM28 commented 2 months ago

Whoa, what a coincidence! Lucky day for me :)

So, as far as I understand, to run Minos I need to build a Singularity image and specify the path with the --mikado-container option, right?

gemygk commented 2 months ago

Yes, you are correct. Provide the path to the Mikado container as --mikado-container /path/to/mikado.sif when running the Minos pipeline.

jacopoM28 commented 2 months ago

Thank you very much for the quick answer. I think everything has been installed successfully now. However, I was running a test analysis and encountered this error:

(minos) [jmartelossi@delta Test_Minos]$ minos configure --mikado-container /beegfs/projects/bistagroup/softwares/mikado_singularity/mikado-2.3.5rc2.sif list.txt scoring_template.yaml qmProCava1.cleaned.primary.curated.softmasked.fa 
Starting MINOS V 1.8.0

Runmode is configure
Configuring run...
Traceback (most recent call last):
  File "/beegfs/projects/bistagroup/miniforge3/envs/minos/bin/minos", line 8, in <module>
    sys.exit(main())
  File "/beegfs/projects/bistagroup/miniforge3/envs/minos/lib/python3.7/site-packages/minos/__main__.py", line 166, in main
    MinosRunConfiguration(args).run()
  File "/beegfs/projects/bistagroup/miniforge3/envs/minos/lib/python3.7/site-packages/minos/minos_configure.py", line 144, in run
    self._generate_scoring_file(self.args)
  File "/beegfs/projects/bistagroup/miniforge3/envs/minos/lib/python3.7/site-packages/minos/minos_configure.py", line 70, in _generate_scoring_file
    self.smm = ScoringMetricsManager(args)
  File "/beegfs/projects/bistagroup/miniforge3/envs/minos/lib/python3.7/site-packages/minos/minos_scoring.py", line 108, in __init__
    args.external_metrics, use_tpm=args.use_tpm_for_picking
  File "/beegfs/projects/bistagroup/miniforge3/envs/minos/lib/python3.7/site-packages/minos/minos_scoring.py", line 45, in __importMetricsData
    for row in csv.reader(open(fn), delimiter="\t", quotechar='"'):
FileNotFoundError: [Errno 2] No such file or directory: ''

The list.txt file looks like this:

braker.gff3 braker  False   0   False
miniprot_representatives.gff    miniprot    False   0   False
transcripts_merged.gff  transcripts False   0   False
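
A list file like the one above is normally tab-delimited, one input per line, and the rows as pasted here may have lost their tabs. A quick pre-flight check can catch rows that were separated with spaces instead. The sketch below is hypothetical: check_list_rows is not part of Minos or Mikado, and the minimum of three columns (file, label, True/False flag) is an assumption based on the rows shown, not on the tools' documentation.

```python
import csv
import io

MIN_COLUMNS = 3  # assumption: file path, label, True/False flag


def check_list_rows(text, min_columns=MIN_COLUMNS):
    """Return (line_number, message) pairs for suspicious rows in a tab-delimited list."""
    problems = []
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for number, row in enumerate(reader, start=1):
        if not any(cell.strip() for cell in row):
            continue  # ignore blank lines
        if len(row) < min_columns:
            problems.append(
                (number, f"expected at least {min_columns} tab-separated columns, found {len(row)}")
            )
        elif len(row) > 2 and row[2] not in ("True", "False"):
            problems.append((number, f"third column should be True or False, found {row[2]!r}"))
    return problems


if __name__ == "__main__":
    tabbed = "braker.gff3\tbraker\tFalse\t0\tFalse\n"
    spaced = "braker.gff3 braker False 0 False\n"
    print(check_list_rows(tabbed))  # no problems expected for a tabbed row
    print(check_list_rows(spaced))  # the space-separated row is flagged
```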

The scoring_template.yaml has been copied from the Minos directory.

Thank you again for your assistance.

Jacopo

swarbred commented 2 months ago

Dear @jacopoM28

Your posted command does not include --external-metrics (see https://github.com/EI-CoreBioinformatics/minos?tab=readme-ov-file#metrics-info), which Minos uses to help score the models.

I have attached a real file, external_metrics.txt, as an example. Not every metric class needs to be included, but at least one does.
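
For context, the traceback earlier in the thread ends in open(fn) with an empty file name, which is consistent with --external-metrics not having been given. The sketch below reproduces that failure mode and adds a guard. read_metrics_table is a hypothetical helper, not the Minos implementation, although its csv.reader call mirrors the one shown in the traceback; the table contents are placeholders, not a real metrics file.

```python
import csv
import os
import tempfile


def read_metrics_table(path):
    """Read a tab-delimited table, failing early with a clear message if no path is given."""
    if not path:
        raise ValueError("no metrics file given; was --external-metrics omitted?")
    with open(path, newline="") as handle:
        return [row for row in csv.reader(handle, delimiter="\t", quotechar='"') if row]


if __name__ == "__main__":
    # Unguarded: opening an empty path raises FileNotFoundError, as in the traceback.
    try:
        open("")
    except FileNotFoundError as error:
        print("unguarded:", error)

    # Guarded: the helper explains what is probably missing.
    try:
        read_metrics_table("")
    except ValueError as error:
        print("guarded:", error)

    # A placeholder two-column table, just to show the happy path.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("col_a\tcol_b\n1\t2\n")
    print(read_metrics_table(tmp.name))
    os.remove(tmp.name)
```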

Below is a typical minos configure command for reference:

source minos-1.9.0-dev1 && minos configure --mikado-container /ei/software/cb/mikado/2.3.3/x86_64/mikado-2.3.3_CBG.img -o Minos_run1 --external-metrics external_metrics.txt --external minos_mikado_external_config_plants_141023.yaml --genus-identifier Trdub --annotation-version EIv1.0 --busco-level p,g --busco-lineage /ei/public/databases/BUSCO/20230404/lineages/fabales_odb10 --busco-scoring 5 --use-diamond --use-tpm-for-picking list.txt minos_scoring_template_plants_141023.yaml --config-file minos_config.1.9.0-dev1_busco5.4.7_cbg_231023.yaml Inputs/Reference/GCA_951804385.1_drTriDubi3.1_genomic-clean_headers_seqkit.fa

I will also attach the template scoring files we commonly use, which vary slightly by general species type:

minos_configs.zip

jacopoM28 commented 2 months ago

Hello,

Thank you very much for sharing the files and for helping me with running Minos. I am now encountering a new error:

(minos) [jmartelossi@delta Test_Minos]$ minos configure --mikado-container /beegfs/projects/bistagroup/softwares/mikado_singularity/mikado-2.3.5rc2.sif -o Minos_run1 --external-metrics metrics.txt --external plant_external.yaml list.txt minos_scoring_template_plants_141023.yaml qmProCava1.cleaned.primary.curated.softmasked.fa --config-file minos_config.yaml

Starting MINOS V 1.8.0

Runmode is configure
Configuring run...
Found expression metric: Kallisto_xx but --use-tpm-for-picking was not set. Commenting it on scoring file.
Generating scoring file Minos_run1/minos_run.scoring.yaml ... done.
Generating minos run configuration file Minos_run1/minos_run.run_config.yaml ... done.
Generating mikado configuration file Minos_run1/minos_run.mikado_config.yaml ...Traceback (most recent call last):
  File "/beegfs/projects/bistagroup/miniforge3/envs/minos/bin/minos", line 8, in <module>
    sys.exit(main())
  File "/beegfs/projects/bistagroup/miniforge3/envs/minos/lib/python3.7/site-packages/minos/__main__.py", line 166, in main
    MinosRunConfiguration(args).run()
  File "/beegfs/projects/bistagroup/miniforge3/envs/minos/lib/python3.7/site-packages/minos/minos_configure.py", line 146, in run
    self._run_mikado_configure(self.args)
  File "/beegfs/projects/bistagroup/miniforge3/envs/minos/lib/python3.7/site-packages/minos/minos_configure.py", line 63, in _run_mikado_configure
    out = subprocess.check_output(cmd, shell=True, stderr=subprocess.STDOUT)
  File "/beegfs/projects/bistagroup/miniforge3/envs/minos/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/beegfs/projects/bistagroup/miniforge3/envs/minos/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'singularity exec /beegfs/projects/bistagroup/softwares/mikado_singularity/mikado-2.3.5rc2.sif mikado configure --codon-table 0 --list list.txt  --external plant_external.yaml  -od Minos_run1 --reference qmProCava1.cleaned.primary.curated.softmasked.fa --scoring Minos_run1/minos_run.scoring.yaml  Minos_run1/minos_run.mikado_config.yaml --full' returned non-zero exit status 1.
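
The CalledProcessError above reports only the exit status, but because the call passes stderr=subprocess.STDOUT, the child's combined output is still attached to the exception as its output attribute. The sketch below shows this with a stand-in child process. run_and_report is a hypothetical wrapper, not Minos code, and it passes the command as a list rather than using shell=True as the traceback does.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd and return (exit_status, combined stdout+stderr text)."""
    try:
        out = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
        return 0, out.decode(errors="replace")
    except subprocess.CalledProcessError as error:
        # error.output holds everything the child printed before it failed.
        return error.returncode, error.output.decode(errors="replace")


if __name__ == "__main__":
    # Stand-in for a failing tool: write a message to stderr and exit with status 1.
    child = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('config invalid\\n'); sys.exit(1)",
    ]
    status, text = run_and_report(child)
    print(status, text.strip())
```

Printing the captured text alongside the status would surface the underlying message without having to re-run the failing command by hand.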

If I run the Mikado command that causes the error:

(minos) [jmartelossi@delta Test_Minos]$ singularity exec /beegfs/projects/bistagroup/softwares/mikado_singularity/mikado-2.3.5rc2.sif mikado configure --codon-table 0 --list list.txt  --external plant_external.yaml  -od Minos_run1 --reference qmProCava1.cleaned.primary.curated.softmasked.fa --scoring Minos_run1/minos_run.scoring.yaml  Minos_run1/minos_run.mikado_config.yaml --full

2024-10-02 10:04:22,110 - to_json - configurator.py:194 - CRITICAL - load_and_validate_config - MainProcess - The configuration file is invalid. 
Validation errors if a Daijin configuration file was expected:
{'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}
Validation errors if a Mikado configuration file was expected:
{'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}
2024-10-02 10:04:22,110 - to_json - configurator.py:207 - ERROR - load_and_validate_config - MainProcess - Loading the configuration file failed with error:
The configuration file is invalid. 
Validation errors if a Daijin configuration file was expected:
{'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}
Validation errors if a Mikado configuration file was expected:
{'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}

Traceback (most recent call last):
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/Mikado/configuration/configurator.py", line 183, in load_and_validate_config
    config = MikadoConfiguration.Schema().load(config, partial=external)
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/marshmallow_dataclass/__init__.py", line 639, in load
    all_loaded = super().load(data, many=many, **kwargs)
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/marshmallow/schema.py", line 719, in load
    return self._do_load(
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/marshmallow/schema.py", line 904, in _do_load
    raise exc
marshmallow.exceptions.ValidationError: {'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/Mikado/configuration/configurator.py", line 186, in load_and_validate_config
    config = DaijinConfiguration.Schema().load(config, partial=external)
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/marshmallow_dataclass/__init__.py", line 639, in load
    all_loaded = super().load(data, many=many, **kwargs)
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/marshmallow/schema.py", line 719, in load
    return self._do_load(
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/marshmallow/schema.py", line 904, in _do_load
    raise exc
marshmallow.exceptions.ValidationError: {'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/Mikado/configuration/configurator.py", line 195, in load_and_validate_config
    raise exc
marshmallow.exceptions.ValidationError: The configuration file is invalid. 
Validation errors if a Daijin configuration file was expected:
{'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}
Validation errors if a Mikado configuration file was expected:
{'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}
Mikado crashed, cause:
"The configuration file passed is invalid. Please double check. Exception: The configuration file is invalid. \nValidation errors if a Daijin configuration file was expected:\n{'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}\nValidation errors if a Mikado configuration file was expected:\n{'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 
'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}"
Traceback (most recent call last):
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/Mikado/configuration/configurator.py", line 183, in load_and_validate_config
    config = MikadoConfiguration.Schema().load(config, partial=external)
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/marshmallow_dataclass/__init__.py", line 639, in load
    all_loaded = super().load(data, many=many, **kwargs)
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/marshmallow/schema.py", line 719, in load
    return self._do_load(
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/marshmallow/schema.py", line 904, in _do_load
    raise exc
marshmallow.exceptions.ValidationError: {'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/Mikado/configuration/configurator.py", line 186, in load_and_validate_config
    config = DaijinConfiguration.Schema().load(config, partial=external)
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/marshmallow_dataclass/__init__.py", line 639, in load
    all_loaded = super().load(data, many=many, **kwargs)
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/marshmallow/schema.py", line 719, in load
    return self._do_load(
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/marshmallow/schema.py", line 904, in _do_load
    raise exc
marshmallow.exceptions.ValidationError: {'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/Mikado/configuration/configurator.py", line 195, in load_and_validate_config
    raise exc
marshmallow.exceptions.ValidationError: The configuration file is invalid. 
Validation errors if a Daijin configuration file was expected:
{'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}
Validation errors if a Mikado configuration file was expected:
{'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/Mikado/__main__.py", line 68, in main
    args.func(args)
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/Mikado/subprograms/configure.py", line 89, in create_config
    other = dataclasses.asdict(load_and_validate_config(args.external, external=True))
  File "/opt/conda/envs/mikado_env/lib/python3.9/site-packages/Mikado/configuration/configurator.py", line 208, in load_and_validate_config
    raise InvalidConfiguration(f"The configuration file passed is invalid. Please double check. Exception: {exc}")
Mikado.exceptions.InvalidConfiguration: "The configuration file passed is invalid. Please double check. Exception: The configuration file is invalid. \nValidation errors if a Daijin configuration file was expected:\n{'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}\nValidation errors if a Mikado configuration file was expected:\n{'scoring': {'selected_cds_num': ['Unknown field.'], 'three_utr_num': ['Unknown field.'], 'five_utr_num': ['Unknown field.'], 'number_internal_orfs': ['Unknown field.'], 'blast_score': ['Unknown field.'], 'three_utr_length': ['Unknown field.'], 'retained_intron_num': ['Unknown field.'], 'non_verified_introns_num': ['Unknown field.'], 'cdna_length': ['Unknown field.'], 'cds_not_maximal': ['Unknown field.'], 'end_distance_from_junction': ['Unknown field.'], 'selected_cds_fraction': ['Unknown field.'], 'intron_fraction': ['Unknown field.'], 'external.tpm': ['Unknown 
field.'], 'exon_num': ['Unknown field.'], 'highest_cds_exon_number': ['Unknown field.'], 'five_utr_length': ['Unknown field.'], 'proportion_verified_introns_inlocus': ['Unknown field.'], 'selected_cds_length': ['Unknown field.'], 'cds_not_maximal_fraction': ['Unknown field.'], 'retained_fraction': ['Unknown field.'], 'selected_cds_intron_fraction': ['Unknown field.'], 'combined_cds_locus_fraction': ['Unknown field.'], 'is_complete': ['Unknown field.']}, 'not_fragmentary': ['Unknown field.'], 'as_requirements': ['Unknown field.'], 'requirements': ['Unknown field.']}"

I suppose there is something wrong with the minos_config.yaml, but all I have done so far is copy the one provided in minos/etc/minos_config.yaml and remove all the paths under the program_calls section, since all the software is installed in the Minos conda environment or loaded as modules. I am attaching the minos_config.yaml file (as a .txt file for GitHub compatibility).

minos_config.yaml.txt

Thank you again and all the best, Jacopo

swarbred commented 1 month ago

Hi @jacopoM28, it won't be an issue with the minos_config.yaml, as that is not used in the mikado configure command. You actually don't need to set --config-file minos_config.yaml unless there are changes to the default config that you need to introduce. Following your minos configure command, three files should be created:

  1. minos_run.scoring.yaml - this is the scoring file created based on your provided template file minos_scoring_template_plants_141023.yaml
  2. minos_run.run_config.yaml - this is the minos config file and incorporates info from any --config-file provided

Both of those files were created successfully; it's the mikado configure step, which generates the third file, that fails. The error indicates it's an issue with the scoring file.
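As a quick sanity check, a short script can report which of the expected files are present in the output directory. This is only a sketch: the file names are taken from the commands in this thread (the third, minos_run.mikado_config.yaml, is the output name used in the mikado configure command), and the helper `report_outputs` is a hypothetical name, not part of Minos.

```python
from pathlib import Path

# File names as they appear in this thread; adjust if your prefix differs.
EXPECTED = (
    "minos_run.scoring.yaml",
    "minos_run.run_config.yaml",
    "minos_run.mikado_config.yaml",
)


def report_outputs(outdir, expected=EXPECTED):
    """Return {file name: True/False} for each expected file in outdir."""
    base = Path(outdir)
    return {name: (base / name).is_file() for name in expected}


if __name__ == "__main__":
    # "Minos_run1" is the output directory used in the commands above.
    for name, present in report_outputs("Minos_run1").items():
        print(f"{name}: {'found' if present else 'MISSING'}")
```

If only the third file is reported missing, that would be consistent with the failure being in the mikado configure step.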

Can you check Minos_run1/minos_run.scoring.yaml and see whether it looks OK (the error report indicates it was generated OK), and attach it to this ticket?

Just to debug, can you rerun the mikado configure command, but this time without --external plant_external.yaml:

singularity exec /beegfs/projects/bistagroup/softwares/mikado_singularity/mikado-2.3.5rc2.sif mikado configure --codon-table 0 --list list.txt -od Minos_run1 --reference qmProCava1.cleaned.primary.curated.softmasked.fa --scoring Minos_run1/minos_run.scoring.yaml Minos_run1/minos_run.mikado_config.yaml --full

Also try changing --scoring Minos_run1/minos_run.scoring.yaml to the template scoring file, i.e. minos_scoring_template_plants_141023.yaml.

Then let us know.

jacopoM28 commented 1 month ago

Dear @swarbred,

I managed to run mikado configure without the --external flag, with both the --scoring Minos_run1/minos_run.scoring.yaml and the --scoring minos_scoring_template_plants_141023.yaml options. Do you have any idea why the plant_external.yaml file is not working?

Please let me know if you need any further information from my side. Thank you very much for your support, I really appreciate it.

All the best, Jacopo

swarbred commented 1 month ago

Hi @jacopoM28

Is the plant_external.yaml file you are using the minos_mikado_external_config_plants_141023.yaml file I attached, just renamed, or did you edit it?

jacopoM28 commented 1 month ago

Yes, indeed, the one you provided is working, but the one specified in the Minos manual sample_data/plant_external.yaml is not.

Now I'll try to continue with the pipeline and will let you know if any other errors occur.

Thank you again, Jacopo

jacopoM28 commented 1 month ago

Dear @swarbred,

unfortunately, I now have some problems with the minos run command. I am attaching the complete log file.

Jacopo

2024-10-08T122843.216363.snakemake.log

swarbred commented 1 month ago

Regarding

Yes, indeed, the one you provided is working, but the one specified in the Minos manual sample_data/plant_external.yaml is not.

That is a mikado scoring file, not the mikado config file that --external expects (it's not the best name, given how we have named options in minos).
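One way to tell the two kinds of file apart before passing one to --external is to look at the top-level YAML keys. The sketch below is a heuristic based only on the validation error earlier in this thread, where `scoring`, `requirements`, `as_requirements` and `not_fragmentary` were reported as unknown fields; the function names are hypothetical, and loading the file assumes PyYAML is available.

```python
# Top-level keys that the validation error in this thread reported as
# "Unknown field." when the file was parsed as a configuration file.
SCORING_KEYS = {"scoring", "requirements", "as_requirements", "not_fragmentary"}


def looks_like_scoring_file(top_level_keys):
    """Heuristic: True if any scoring-only section is among the top-level keys."""
    return bool(SCORING_KEYS & set(top_level_keys))


def top_level_keys(path):
    """Load a YAML file and return its top-level keys (requires PyYAML)."""
    import yaml  # imported lazily so the heuristic above has no dependencies

    with open(path) as handle:
        data = yaml.safe_load(handle)
    # Anything other than a mapping has no top-level keys to inspect.
    return list(data) if isinstance(data, dict) else []
```

For example, `looks_like_scoring_file(top_level_keys("plant_external.yaml"))` returning True would suggest the file is a scoring file and is better suited to --scoring than to --external.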

swarbred commented 1 month ago

@jacopoM28 from a quick look, can you check Minos_run3/tx2gene/ and see whether the transcript-to-gene mapping files were created? The error reported relates to running busco. Can you post your minos configure command? I will send you an example set of data and you can see if that runs correctly (it's a region of ~100 genes, so it will be quick to run).
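A minimal sketch for inspecting that directory is given below. It simply lists whatever files are present with their sizes and flags empty ones. It does not assume any particular file names inside tx2gene, and `list_outputs` is a hypothetical helper, not part of Minos.

```python
from pathlib import Path


def list_outputs(directory):
    """Return sorted (file name, size in bytes) pairs for files in directory."""
    base = Path(directory)
    if not base.is_dir():
        return []
    return sorted((p.name, p.stat().st_size) for p in base.iterdir() if p.is_file())


if __name__ == "__main__":
    # Directory name taken from the comment above.
    entries = list_outputs("Minos_run3/tx2gene")
    if not entries:
        print("no files found (directory missing or empty)")
    for name, size in entries:
        flag = "\t<- empty" if size == 0 else ""
        print(f"{name}\t{size} bytes{flag}")
```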

swarbred commented 1 month ago

Download the example data from https://drive.google.com/drive/folders/1gb5CzNWxu7tZ_VkrsmsQ3Yphb_mUEilO?usp=drive_link

Then run it similarly to Minos/Chr3-1065466-1464870/commands.txt (modified to point to your mikado container). This data is the same as for our annotation workshop; see https://github.com/EI-CoreBioinformatics/annotation-workshop-2024/wiki/Minos#minos-commands

You can then rerun with the busco options i.e. as https://github.com/EI-CoreBioinformatics/annotation-workshop-2024/wiki/Minos#commands-with-busco

If you then rerun with the busco genome run, i.e. option g, this will require normal resources, as the full Chr3 is provided.

swarbred commented 1 month ago

If you can run with the example data, then the problem will be your inputs; if Minos_run3/tx2gene/ is not correct (the log suggests this step was attempted multiple times), then review the input GFFs (though Braker output should work OK).

jacopoM28 commented 1 month ago

Thank you for providing the toy dataset and the tutorial. Unfortunately, I tried running Minos with your example dataset, but I still can't get it to work. Could this error, AttributeError: module 'pulp' has no attribute 'list_solvers', be the reason?

I apologize for this long thread and for bothering you with all the errors, but I think Minos is a really promising tool, and I would really like to use it. I am attaching the log file.

minos.run.My_minos_run.log

Best, Jacopo

gemygk commented 1 month ago

Hi @jacopoM28, it looks like the version of pulp you have installed is too recent for the Snakemake version you are using. Can you pin pulp to pulp==2.7.0 and run the commands again, please?

For future reference, can you also add the version of Snakemake you are using?
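To gather both pieces of information in one go, something like the following could be run inside the Minos environment. It is a sketch rather than an official diagnostic: it reports the installed pulp and Snakemake versions and checks whether the installed pulp exposes the `list_solvers` attribute named in the AttributeError. The helper names are hypothetical.

```python
import importlib
from importlib import metadata


def has_list_solvers(module):
    """True if the module exposes the `list_solvers` attribute named in the error."""
    return hasattr(module, "list_solvers")


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for package in ("pulp", "snakemake"):
        print(package, installed_version(package))
    if installed_version("pulp") is not None:
        pulp = importlib.import_module("pulp")
        print("pulp.list_solvers available:", has_list_solvers(pulp))
```

If `list_solvers` is reported as unavailable, pinning as suggested above (for example with `pip install "pulp==2.7.0"`, or the conda equivalent) and rerunning the check should show whether the pin took effect.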