Open Sasparle opened 2 years ago
Hi @Sasparle,
Yes, you're onto something. The methylation binning method is designed to work on microbiome samples, not on single bacteria. The small number of contigs (n=3 in your case) makes binning impossible with our approach.
From your post, I can't tell what your goal is. If you want to find methylation motifs, I would suggest following the "Individual bacteria" tutorial. Once you have the set of methylated motifs, you can look at the methylation profile with the corresponding tutorial to check the status of those motifs in the phages. Please keep in mind that nanodisco is not suited for single-event methylation detection (see Q7).
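To make the contig-count problem concrete, here is a minimal Python sketch; it is not nanodisco code. It assumes that the t-SNE step places one point per contig and enforces the same rule as the R package Rtsne, namely that 3 × perplexity must not exceed the number of points minus 1. The contig labels, the row split between them, and the helper names are made up for illustration.

```python
from collections import Counter


def contig_summary(contig_labels):
    """Count how many feature rows each contig contributes."""
    return Counter(contig_labels)


def max_perplexity(n_points):
    """Largest perplexity allowed under the assumed Rtsne-style rule."""
    return (n_points - 1) / 3


def can_run_tsne(n_points, perplexity):
    """True if 3 * perplexity <= n_points - 1 (assumed Rtsne-style check)."""
    return n_points - 1 >= 3 * perplexity


# Hypothetical labels: 328 rows spread over 3 contigs (the split is invented).
labels = ["bacterium"] * 200 + ["phage_1"] * 64 + ["phage_2"] * 64
n_contigs = len(contig_summary(labels))

print("contigs:", n_contigs)
print("max perplexity:", max_perplexity(n_contigs))
print("perplexity 1 allowed:", can_run_tsne(n_contigs, 1))
print("perplexity 30 allowed with 2905 contigs:", can_run_tsne(2905, 30))
```

Under that assumption, 3 contigs leave no valid perplexity of 1 or above, which would explain why lowering `--tsne_perplexity` does not help.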
Best,
Alan
Hi, I am new to nanodisco and to methylation profile analysis in general. I have come across the same issue multiple times while trying to perform methylation binning on a nanopore-sequenced dataset composed of 1 bacterium and 2 phages. I've been following the detailed tutorial on the website as indicated, and everything seems to go smoothly until I reach the binning step with `nanodisco binning`, where I systematically come across this error with both datasets (I used the automated profile matrix method):

What I find very odd is that the t-SNE matrix made by nanodisco only contains 3 distinct contigs for 123 motifs. Knowing that my filtered matrix from `nanodisco filter_profile` is of dimension 328 x 6 with the same 3 unique "contigs", it makes sense for the t-SNE matrix to be so small. However, when I compare my filtered profile matrix to the one in the example dataset, I notice that the example dataset contains WAY more contigs than mine (2905 unique contigs). Could it be a problem with the way my input files are treated, which leads to all reads from a sample being placed within the same "contig" in the filtered profile matrix? Does that mean it's not possible to perform methylation binning on a single bacterial sample?

I also tried to lower the `--tsne_perplexity` parameter, but it doesn't change anything, even when I set it to 1.

Finally, here is the command I used (the dataset is named `all`):

Does anybody have a clue about what could be going wrong in the process? I've been looking into this issue for hours and have yet to find how to make my data compatible with the binning process.
Thanks a lot!