cultivarium / MicrobeMod

A toolkit for exploring prokaryotic methylation and base modifications in nanopore sequencing
MIT License

strain comparison #21

Closed stevebaeyen closed 3 months ago

stevebaeyen commented 4 months ago

This looks very interesting. I will try it out on a collection of plant pathogenic bacteria (I already have native ONT data). Is there a recommended tool or workflow for comparing the output of call_methylation across a collection of strains? If not, how should I proceed with the output? Thanks in advance for your advice!

alexcritschristoph commented 4 months ago

I'd suggest writing a loop to process each of your strains: assemble each genome, map the reads to its assembly, and then run MicrobeMod on the mappings.

You can then read all of the resulting *motifs.tsv files into a single table in Python, like this:

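import glob
import os
import pandas as pd

# Read every *motifs.tsv file in the current directory
tables = []
for fn in sorted(glob.glob('./*motifs.tsv')):
    d = pd.read_csv(fn, sep="\t")
    # Strain name = file name without the "_motifs.tsv" suffix
    d['Strain'] = os.path.basename(fn).split("_motifs.tsv")[0]
    tables.append(d)

# Combine all strains into one table (raises ValueError if no files matched)
motifs = pd.concat(tables, ignore_index=True)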
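
Once everything is in one table, one possible way to compare strains is to pivot it into a strain-by-motif matrix. The sketch below is only an illustration: it uses a small made-up table instead of real MicrobeMod output, and the column names "Motif" and "Methylation_coverage" are assumptions, so check the header of your own *motifs.tsv files and adjust. The helper name motif_matrix is hypothetical too.

```python
import pandas as pd

# Toy stand-in for the concatenated `motifs` table. The values are invented and
# the column names "Motif" and "Methylation_coverage" are assumptions.
motifs = pd.DataFrame({
    "Strain": ["strainA", "strainA", "strainB", "strainC", "strainC"],
    "Motif": ["GATC", "CCWGG", "GATC", "GATC", "GANTC"],
    "Methylation_coverage": [0.98, 0.95, 0.97, 0.40, 0.99],
})

def motif_matrix(table, value_col="Methylation_coverage"):
    """Return a strain x motif matrix; NaN marks motifs not reported for a strain."""
    return table.pivot_table(index="Strain", columns="Motif",
                             values=value_col, aggfunc="max")

matrix = motif_matrix(motifs)

# True where a motif was reported for a strain
presence = matrix.notna()

# Motifs reported in every strain
shared = presence.columns[presence.all(axis=0)].tolist()

print(matrix)
print("Motifs found in all strains:", shared)
```

The presence table can be passed to a heatmap or clustering function if you want a visual overview. Motifs that are missing from some strains show up as NaN in the matrix, which keeps "not detected" distinct from a low value.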