Dee-chen / Tree2gd

GNU General Public License v3.0

A little advice #9

Open lfp-a opened 1 year ago

lfp-a commented 1 year ago

During clustering, some groups can contain a very large number of sequences. These groups have little effect on the overall analysis but consume a large share of the computing resources. Could you add a parameter that limits the maximum number of sequences per group?

Dee-chen commented 1 year ago

Thank you for the suggestion. I will consider adding this parameter in a later version. In the current version, the MCL gene family clustering results from step 2 are already sorted by the number of genes in each family. You can filter them manually and then run the remaining steps automatically with the --step 3456 parameter. In our laboratory's experience, analyzing genome-wide duplication events requires a lot of manual filtering and screening of gene families and homologous genes, so I designed Tree2gd to run as 6 independent steps with good compatibility.
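The manual filtering described above could be scripted along the following lines. This is only a rough sketch, not part of Tree2gd: it assumes the step 2 cluster file uses the usual MCL layout of one gene family per line with whitespace-separated gene IDs, and the function names and the `max_genes` threshold are hypothetical. Check the actual file name and format in your own step 2 output before relying on it.

```python
from typing import Iterable, List


def filter_clusters(lines: Iterable[str], max_genes: int) -> List[str]:
    """Keep only clusters with at most `max_genes` members.

    Assumption: each line is one cluster (gene family) and gene IDs are
    separated by whitespace, as in standard MCL output.
    """
    kept = []
    for line in lines:
        genes = line.split()
        # Skip blank lines and clusters above the size limit.
        if genes and len(genes) <= max_genes:
            kept.append("\t".join(genes))
    return kept


def filter_cluster_file(in_path: str, out_path: str, max_genes: int) -> int:
    """Write the filtered clusters to `out_path`; return how many were kept."""
    with open(in_path) as src:
        kept = filter_clusters(src, max_genes)
    with open(out_path, "w") as dst:
        for row in kept:
            dst.write(row + "\n")
    return len(kept)
```

Because the relative order of the kept lines is preserved, a file that was sorted by family size stays sorted after filtering. The filtered file would then stand in for the original step 2 result before rerunning with --step 3456.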

lfp-a commented 1 year ago

Thank you for your answer. I will try running the steps separately.