Open Ge0rges opened 3 weeks ago
Are you using the bin-motifs.tsv file as input to MTase-linker? The motif.tsv and motif-scored.tsv files do not work with the pipeline.
Ah yes, I am using motifs.tsv. I can switch to using bin-motifs.tsv. I am conducting this analysis within the context of a single genome, so I got misled by the bin prefix.
What is the difference between those files? The output section doesn't include that information yet.
Also, please make sure to use the newest version of Nanomotif (v. 0.1.15), as it resolves issues present in previous versions.
The difference between motifs.tsv and bin-motifs.tsv lies in a series of post-processing steps applied to generate a consensus set of motifs across contigs for each bin (genome in your case). Some contigs will not have the motifs in their sequence, and other contigs might show slight variation in the motif compared to the rest of the bin due to noise or just the context in which the motif is observed. To account for this, we apply post-processing to find consensus motifs across a whole genome and output this in bin-motifs.tsv.
If you want to find motifs in a single genome, bin-motifs.tsv is your go-to file. motifs.tsv and score-motifs.tsv are more relevant for binning.
For more details on these post-processing steps, refer to supplementary note 1 of our preprint: https://www.biorxiv.org/content/10.1101/2024.04.29.591623v1
Got it. Regarding the version, did you mean v0.4.15? That's what is indicated on both the PyPI page and your conda meta.yaml. My installation defaulted to that. When I forced pip to install 0.1.15, I got a version of nanomotif with slightly different commands, I believe, including complete-workflow, which I think wasn't present in the previous version I had installed.
Yes, sorry about the confusion. I meant v0.4.15.
Getting a different error now on the latest version with the correct input file.
Select jobs to execute...
[Tue Oct 22 11:35:25 2024]
rule motif_assignment:
input: nanomotif/brevundimonas_r-contigs/mtase-linker/pfam_hmm_hits/brevundimonas_r-contigs_gene_id_mod_table.tsv, nanomotif/brevundimonas_r-contigs/mtase-linker/defensefinder/brevundimonas_r-contigs_processed_defense_finder_mtase.tsv, nanomotif/brevundimonas_r-contigs/mtase-linker/blastp/brevundimonas_r-contigs_rebase_mtase_sign_alignment.tsv, nanomotif/brevundimonas_r-contigs/bin-motifs.tsv, /localdata/researchdrive/gkanaan/seaice_methylation/nanomotif/contig_bin.tsv
output: nanomotif/brevundimonas_r-contigs/mtase-linker/mtase_assignment_table.tsv, nanomotif/brevundimonas_r-contigs/mtase-linker/nanomotif_assignment_table.tsv
jobid: 1
reason: Missing output files: nanomotif/brevundimonas_r-contigs/mtase-linker/mtase_assignment_table.tsv; Input files updated by another job: nanomotif/brevundimonas_r-contigs/mtase-linker/pfam_hmm_hits/brevundimonas_r-contigs_gene_id_mod_table.tsv, nanomotif/brevundimonas_r-contigs/mtase-linker/defensefinder/brevundimonas_r-contigs_processed_defense_finder_mtase.tsv, nanomotif/brevundimonas_r-contigs/mtase-linker/blastp/brevundimonas_r-contigs_rebase_mtase_sign_alignment.tsv
resources: tmpdir=/tmp
Activating conda environment: ../../../../researchdrive/gkanaan/tools/ML_dependencies/ML_envs/71dd0a79701938f24ea6c2c3e756d4dc_
Activating conda environment: ../../../../researchdrive/gkanaan/tools/ML_dependencies/ML_envs/71dd0a79701938f24ea6c2c3e756d4dc_
Traceback (most recent call last):
File "/localdata/researchdrive/gkanaan/seaice_methylation/.snakemake/scripts/tmpnqxvjl1f.motif_assignment.py", line 103, in <module>
nanomotif_table_mm50.loc[:,'linked'] = False
~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
File "/researchdrive/gkanaan/tools/ML_dependencies/ML_envs/71dd0a79701938f24ea6c2c3e756d4dc_/lib/python3.12/site-packages/pandas/core/indexing.py", line 885, in __setitem__
iloc._setitem_with_indexer(indexer, value, self.name)
File "/researchdrive/gkanaan/tools/ML_dependencies/ML_envs/71dd0a79701938f24ea6c2c3e756d4dc_/lib/python3.12/site-packages/pandas/core/indexing.py", line 1809, in _setitem_with_indexer
raise ValueError(
ValueError: cannot set a frame with no defined index and a scalar
[Tue Oct 22 11:35:28 2024]
Error in rule motif_assignment:
jobid: 1
input: nanomotif/brevundimonas_r-contigs/mtase-linker/pfam_hmm_hits/brevundimonas_r-contigs_gene_id_mod_table.tsv, nanomotif/brevundimonas_r-contigs/mtase-linker/defensefinder/brevundimonas_r-contigs_processed_defense_finder_mtase.tsv, nanomotif/brevundimonas_r-contigs/mtase-linker/blastp/brevundimonas_r-contigs_rebase_mtase_sign_alignment.tsv, nanomotif/brevundimonas_r-contigs/bin-motifs.tsv, /localdata/researchdrive/gkanaan/seaice_methylation/nanomotif/contig_bin.tsv
output: nanomotif/brevundimonas_r-contigs/mtase-linker/mtase_assignment_table.tsv, nanomotif/brevundimonas_r-contigs/mtase-linker/nanomotif_assignment_table.tsv
conda-env: /researchdrive/gkanaan/tools/ML_dependencies/ML_envs/71dd0a79701938f24ea6c2c3e756d4dc_
RuleException:
CalledProcessError in file /Accounts/gkanaan/miniconda3/nanomotif/lib/python3.9/site-packages/nanomotif/mtase_linker/MTase_linker.smk, line 197:
Command 'source /Accounts/gkanaan/anaconda3/bin/activate '/researchdrive/gkanaan/tools/ML_dependencies/ML_envs/71dd0a79701938f24ea6c2c3e756d4dc_'; set -euo pipefail; python /localdata/researchdrive/gkanaan/seaice_methylation/.snakemake/scripts/tmpnqxvjl1f.motif_assignment.py' returned non-zero exit status 1.
File "/Accounts/gkanaan/miniconda3/nanomotif/lib/python3.9/site-packages/nanomotif/mtase_linker/MTase_linker.smk", line 197, in __rule_motif_assignment
File "/Accounts/gkanaan/miniconda3/nanomotif/lib/python3.9/concurrent/futures/thread.py", line 58, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-10-22T113315.257866.snakemake.log
MTase-linker failed with error: Command '['snakemake', '--snakefile', '/Accounts/gkanaan/miniconda3/nanomotif/lib/python3.9/site-packages/nanomotif/mtase_linker/MTase_linker.smk', '--cores', '20', '--config', 'THREADS=20', 'ASSEMBLY=/localdata/researchdrive/gkanaan/seaice_methylation/mags//brevundimonas_r-contigs.fna', 'CONTIG_BIN=/localdata/researchdrive/gkanaan/seaice_methylation/nanomotif/contig_bin.tsv', 'OUTPUTDIRECTORY=nanomotif/brevundimonas_r-contigs/mtase-linker', 'DEPENDENCY_PATH=/researchdrive/gkanaan/tools/ML_dependencies', 'IDENTITY=80', 'QCOVS=80', 'NANOMOTIF=nanomotif/brevundimonas_r-contigs/bin-motifs.tsv', '--use-conda', '--conda-prefix', '/researchdrive/gkanaan/tools/ML_dependencies/ML_envs']' returned non-zero exit status 1.
Can you provide the bin-motifs.tsv you are using?
Here it is:
bin mod_type motif mod_position n_mod_bin n_nomod_bin motif_type motif_complement mod_position_complement n_mod_complement n_nomod_complement
brevundimonas_r-contigs m GGCGCC 2 130 159 palindrome GGCGCC 2 130 159
metagenome_assembly m GGCGCC 2 130 159 palindrome GGCGCC 2 130 159
The error arises from a filtering step in the motif assignment process. MTase-linker only assigns motifs that are methylated in more than 50% of their occurrences across the entire genome. This is defined by the formula:
n_mod_bin / (n_mod_bin + n_nomod_bin) > 0.5
From literature (Beaulaurier 2019), we know that if a methylation motif is targeted by an MTase, typically >95% of motif occurrences are methylated. This is the reason why we choose this threshold of 50%.
In your case, the two motifs have a methylation level below this threshold. As a result, MTase-linker filters these motifs out and attempts to assign an empty table, leading to the error. Thus, currently MTase-linker does not support the assignment of these motifs. Would you be interested in a configurable flag that could adjust this threshold?
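To make the arithmetic concrete, here is a minimal sketch of the filter described above, applied to the two rows from the posted bin-motifs.tsv. It is not the MTase-linker code: the function names and the `min_fraction` parameter are hypothetical and only illustrate what a configurable threshold could look like. Both rows give 130 / (130 + 159), roughly 0.45, so nothing passes the default cutoff and the resulting table is empty.

```python
import csv
import io

# The two rows from the bin-motifs.tsv posted above, reduced to the
# columns the filter needs (tab-separated for this sketch).
BIN_MOTIFS_TSV = (
    "bin\tmod_type\tmotif\tmod_position\tn_mod_bin\tn_nomod_bin\n"
    "brevundimonas_r-contigs\tm\tGGCGCC\t2\t130\t159\n"
    "metagenome_assembly\tm\tGGCGCC\t2\t130\t159\n"
)


def methylation_fraction(n_mod, n_nomod):
    """Return n_mod / (n_mod + n_nomod), or 0.0 if the motif never occurs."""
    total = n_mod + n_nomod
    return n_mod / total if total else 0.0


def filter_motifs(rows, min_fraction=0.5):
    """Keep rows whose methylation fraction is strictly above min_fraction.

    min_fraction is a hypothetical parameter, not an existing MTase-linker flag.
    """
    return [
        row for row in rows
        if methylation_fraction(int(row["n_mod_bin"]), int(row["n_nomod_bin"])) > min_fraction
    ]


rows = list(csv.DictReader(io.StringIO(BIN_MOTIFS_TSV), delimiter="\t"))

kept_default = filter_motifs(rows)                    # 50% cutoff from the thread
kept_relaxed = filter_motifs(rows, min_fraction=0.4)  # illustrative lower cutoff

for row in rows:
    frac = methylation_fraction(int(row["n_mod_bin"]), int(row["n_nomod_bin"]))
    print(row["bin"], row["motif"], f"{frac:.3f}")
print("kept at 0.5:", len(kept_default), "| kept at 0.4:", len(kept_relaxed))
```

With the default cutoff the list comes back empty, which matches the empty table that triggers the error in the traceback. A lower cutoff would let these motifs through, though whether that is biologically meaningful is a separate question.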
It would be interesting to filter the modkit pileup for methylations related to the motif, and then make a similar plot to the ones in figure S8 of the Nanomotif article. I guess you would see something like the middle plot for figure S8.
You might also consider adjusting --threshold_methylation_general, which determines whether a position is seen as methylated or not.
For further details, you might find this previous discussion helpful: link to issue #60.
Hi @JSBoejer, that would be a good flag to have. Generating something like S8 would indeed be interesting! Thanks.
Hello,
Wanted to share the following error obtained when running MTase-linker.