Open leon945945 opened 3 days ago
Hello @leon945945 ,
meryl intersect
takes the count of the first meryl db.
I think you want the meryl difference
instead. It takes the kmer count of the first input db, which does not exist in the second db.
Merqury/trio/hapmers.sh does exactly what you are trying to do - and generates a histogram to visually confirm the hapmer spectrum. I call the parental specific kmers inherited to the child as hapmers.
best, Arang
Thanks for your reply, I will try to take use of merqury
.
And where is the wrong place of my analysis, firstly get the shared kmers between progeny and maternal, progeny and paternal, then merge the two kmer database.Thus, maternal kmers whose count equals to 0 is paternal specific kmers, paternal kmers whose count equals to 0 is maternal specific kmers.
It's my pleasure if you can give me some advances.
Hi, the purpose of my analysis is to get the paternal specific kmer, and to increase the specificity of kmer I chosed 62mer to perform analysis, because 72mer would lead to an error of
meryl
.At first, I generated the 62mer of
progeny
,maternal
andpaternal
with HiFi data forprogeny
andmaternal
, Illumina data forpaternal
.meryl count k=62 progeny.fq output progeny.meryl
meryl count k=62 maternal.fq output maternal.meryl
meryl count k=62 paternal.fq output paternal.meryl
Then, the maternal/paternal count of common kmer between progeny and maternal/paternal were outputed.
meryl intersect maternal.meryl/ progeny.meryl/ output mat-progeny_commonk62.meryl
meryl intersect paternal.meryl/ progeny.meryl/ output pat-progeny_commonk62.meryl
and printed the kmer count:meryl print mat-progeny_commonk62.meryl > mat-progeny.kmerCount
meryl print pat-progeny_commonk62.meryl > pat-progeny.kmerCount
Then, I merged the
mat-progeny.kmerCount
andpat-progeny.kmerCount
, and I thought the maternal kmers whose count equals to 0 were paternal specific kmer. I extract the sequences of paternal specifc kmers and aligned them to the maternal genome (assembled by the same HiFi file with kmer count) withbowtie2
, however, a large proportion of them could be aligned to maternal genome without mismatch or indel (CIGAR=62M), which means the paternal specific kmers could be find in maternal.Here is my confusion, the paternal specific kmer should not be existed in maternal, why they could be aligned in maternal genome. Hope for your reply, thanks very much.