eead-csic-compbio / get_homologues

GET_HOMOLOGUES: a versatile software package for pan-genome analysis

File Output? #103

Closed TommyH-Tran closed 2 years ago

TommyH-Tran commented 2 years ago

Is there a file that identifies the core, soft-core, shell, and cloud genes per genome? I know the pangenome matrix analysis prints this in the terminal, but I was wondering whether it also writes a file with this information and, if so, what the file is called.

-T

eead-csic-compbio commented 2 years ago

Hi @TommyH-Tran, the script parse_pangenome_matrix.pl -s can produce 4 files listing the clusters in the cloud, shell, soft-core and core compartments, as explained here.

Note that those files contain cluster names, so you will have to do some extra work to get the actual genes from a genome of interest, for example on the command line; please let us know if you need help with that.
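As a rough illustration of that extra step, here is a minimal Python sketch. It is not part of GET_HOMOLOGUES and rests on several assumptions: that a compartment list file holds one cluster file name per line, that each cluster is a FASTA file in a single directory, and that the FASTA headers mention the genome somewhere in the header text. The function names, paths and genome label are all hypothetical; check your own output files before relying on it.

```python
from pathlib import Path


def read_cluster_names(list_file):
    """Read cluster names from a list file (assumed: one name per line)."""
    names = []
    for line in Path(list_file).read_text().splitlines():
        line = line.strip()
        # Skip blank lines and anything that looks like a comment line.
        if line and not line.startswith("#"):
            names.append(line)
    return names


def genes_for_genome(cluster_dir, cluster_names, genome_label):
    """Return {cluster name: [FASTA headers mentioning genome_label]}.

    Assumes each cluster name is the name of a FASTA file inside
    cluster_dir and that the genome label appears in the header text.
    """
    hits = {}
    for name in cluster_names:
        fasta = Path(cluster_dir) / name
        if not fasta.is_file():
            continue  # cluster file not found; skip rather than guess
        headers = [
            line[1:].strip()
            for line in fasta.read_text().splitlines()
            if line.startswith(">") and genome_label in line
        ]
        if headers:
            hits[name] = headers
    return hits


# Example call with hypothetical paths and genome label:
# names = read_cluster_names("core_list.txt")
# core_genes = genes_for_genome("clusters_dir", names, "my_genome")
```

Running it once per compartment list would give the core, soft-core, shell and cloud genes for one genome, provided the assumptions above hold for your files.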

Another approach would be to check the pangenome_matrix_genes_t0.tab matrix produced by the script compare_clusters.pl -m, which actually contains the gene names in each cluster, as explained in the latest tutorial.
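For the matrix route, a sketch along the following lines might work. The layout is an assumption, not something confirmed in this thread: a tab-separated file whose first row holds cluster names, with one row per genome after it, the genome name in the first column, and each cell holding comma-separated gene names or a placeholder such as "-" when the genome has no gene in that cluster. The function name and the `empty_marker` parameter are hypothetical; adjust them after inspecting the real file.

```python
import csv


def genes_by_cluster(matrix_path, genome_label, empty_marker="-"):
    """Return {cluster name: [gene names]} for the first matching genome row.

    Assumed layout: header row = cluster names, first column = genome name,
    cells = comma-separated gene names (or empty_marker when absent).
    """
    with open(matrix_path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    if not rows:
        return {}
    header = rows[0]
    for row in rows[1:]:
        if row and genome_label in row[0]:
            return {
                cluster: cell.split(",")
                for cluster, cell in zip(header[1:], row[1:])
                if cell and cell != empty_marker
            }
    return {}  # no row matched the genome label


# Example call with a hypothetical genome label:
# genes = genes_by_cluster("pangenome_matrix_genes_t0.tab", "my_genome")
```

Intersecting the keys of the returned dictionary with the cluster names from each compartment list would then give the per-compartment genes for that genome.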

Hope this helps, Bruno

eead-csic-compbio commented 2 years ago

I have added this to the EST manual

TommyH-Tran commented 2 years ago

@eead-csic-compbio thank you for your clarification Bruno