Prunoideae / MitoFlex

A mitogenome toolkit inspired by MitoZ, while being more effective, precise and flexible.
GNU General Public License v3.0

The result from nhmmer/filter_taxanomy is empty #12

Closed lijunyuan closed 1 year ago

lijunyuan commented 1 year ago

Hi, I am using MitoFlex to assemble a coral mitochondrial genome (corals belong to Cnidaria). However, the program cannot find the mitochondrial genome among the assembled sequences and reports: "MitoFlex : The result from nhmmer/filter_taxanomy is empty! Please check if the data is unqualified, or a wrong taxanomy class is given!". I ran the program with --clade set to Arthropoda. I used the same Arthropoda setting in MitoZ, where the mitochondrial genome can be selected in some cases, but it does not work in MitoFlex even with the same raw data. Could you give some suggestions?

Prunoideae commented 1 year ago

You could try adding --depth-list 10,10,10,10,10,20,20 to loosen the depth filter, or even set it to --depth-list 0,0,0,0,0,0,0. Most of the time this should work, and MEGAHIT should produce some result.

MitoFlex normally expects the input data to be rich in mitogenomic reads. To remove noise and unwanted sequences effectively, a depth filter removes intermediate contigs whose depth falls below the threshold set for each iteration of MEGAHIT's assembly. This speeds up assembly considerably and can make the output contigs more precise. It works because the mitogenome usually has a much higher depth than other sequences in a sample, since a single cell can contain hundreds of mitochondria.

The default setting might not work well if your data is not ideal (or is even metagenomic), or if the overall sequencing depth is simply low. In that case, overriding --depth-list should help, since it removes the depth filtering.
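
To make the idea of a per-iteration depth filter concrete, here is a minimal Python sketch. It is not MitoFlex's or MEGAHIT's actual code: the function names, the contig names, and the depth values are all hypothetical, and it only assumes that each iteration has one threshold taken from the depth list and that contigs below it are dropped.

```python
from typing import Dict, List


def filter_contigs_by_depth(contigs: Dict[str, float], min_depth: float) -> Dict[str, float]:
    """Keep only contigs whose depth is at least min_depth (hypothetical helper)."""
    return {name: depth for name, depth in contigs.items() if depth >= min_depth}


def run_iterations(
    contigs_per_iteration: List[Dict[str, float]], depth_list: List[float]
) -> List[Dict[str, float]]:
    """Apply one threshold from depth_list to the contigs of each iteration."""
    kept = []
    for contigs, threshold in zip(contigs_per_iteration, depth_list):
        kept.append(filter_contigs_by_depth(contigs, threshold))
    return kept


# Made-up example data: contig name -> average depth, for two iterations.
iterations = [
    {"contig_a": 250.0, "contig_b": 4.0, "contig_c": 12.0},
    {"contig_a": 310.0, "contig_c": 15.0},
]

strict = run_iterations(iterations, [10, 20])   # low-depth contigs are dropped
relaxed = run_iterations(iterations, [0, 0])    # thresholds of 0 keep everything

print(strict)
print(relaxed)
```

With thresholds of 0 nothing is filtered out, which is why a list of zeros should help when the mitochondrial reads are not much deeper than the rest of the sample.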

lijunyuan commented 1 year ago

Yes, the mitochondrial content is relatively low in corals compared with higher animals, and I tried both of the parameters you recommended. However, even with --depth-list 0,0,0,0,0,0,0, the error "The result from nhmmer/filter_taxanomy is empty" is still reported, and the .result folder is empty. In contrast, the scaf.fa file in tepm/assemble/*result/ does contain the mitochondrial genome. I know I can extract my mitochondrial genome from scaf.fa using BLAST, but that is less convenient than having the pipeline run smoothly. Could you give me further instructions?
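
For reference, the manual workaround mentioned above could look roughly like the following Python sketch. It assumes the BLAST results are in the default tabular layout (-outfmt 6, columns qseqid, sseqid, pident, ...), that the scaffolds were the subject sequences, and that an identity cutoff of 80% is acceptable. The function names, the example sequences, and the BLAST rows are made up for illustration.

```python
from typing import Dict, Iterable, Set


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into {sequence id: sequence}; the id is the first word of the header."""
    parts: Dict[str, list] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {seq_id: "".join(chunks) for seq_id, chunks in parts.items()}


def hit_ids(blast_rows: Iterable[str], min_identity: float) -> Set[str]:
    """Collect subject ids (column 2) whose percent identity (column 3) passes the cutoff."""
    ids: Set[str] = set()
    for row in blast_rows:
        if not row.strip() or row.startswith("#"):
            continue
        fields = row.rstrip("\n").split("\t")
        if float(fields[2]) >= min_identity:
            ids.add(fields[1])
    return ids


def extract(records: Dict[str, str], ids: Set[str]) -> Dict[str, str]:
    """Return only the records whose id is in ids."""
    return {seq_id: seq for seq_id, seq in records.items() if seq_id in ids}


# Made-up stand-ins for scaf.fa and a tabular BLAST result.
scaffolds = """>scaffold1 len=12
ACGTACGTACGT
>scaffold2 len=8
GGGGCCCC
>scaffold3 len=8
TTTTAAAA
""".splitlines()

blast_rows = [
    "ref_cox1\tscaffold2\t91.5\t8\t0\t0\t1\t8\t1\t8\t1e-20\t50.0",
    "ref_cox1\tscaffold3\t62.0\t8\t3\t0\t1\t8\t1\t8\t1e-02\t12.0",
]

selected = extract(read_fasta(scaffolds), hit_ids(blast_rows, 80.0))
print(selected)
```

In practice the two inputs would be read from the real scaf.fa and from the BLAST output file instead of the inline strings.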

Prunoideae commented 1 year ago

Then you will have to add --disable-taxa to disable the filter completely. This is interesting: I thought this pipeline should outperform SOAPdenovo in most cases. Maybe it's a special case for MEGAHIT, since it's a metagenomic assembler.

lijunyuan commented 1 year ago

With --disable-taxa used, the mitochondrial genome has been picked.