elizabethmcd / metabolisHMM

Tool for constructing phylogenies and summarizing metabolic characteristics based on curated and custom profile HMMs
GNU General Public License v3.0
17 stars, 5 forks

Fasta formatting #52

Closed (Streptomyces1 closed 3 years ago)

Streptomyces1 commented 4 years ago

Dear all, I'm having an issue at the very beginning of the analysis. When I try to run the summarize-metabolism function, I get the following message: "These do not look like fasta files that end in .fna or .faa. Please check your genome files."

I've been trying to load some MAGs that I recovered from a metagenome; they were originally named as ".fa" files. I tried to rename them but had no success. I also tried running the tool on reference genomes I found online, and renaming those, but had the same issue. I have the feeling that this is a very simple thing to fix, but I'm still a beginner, so I'd be happy to have some help to overcome it.

All the best, Yuri.

elizabethmcd commented 4 years ago

Hi Yuri, do you have other files in your directory that aren't fasta files? The directory can only contain fasta files that end in .fna or .faa, and they all need to be the same type (so either all nucleotide or all protein files). To rename files, try this for loop if you are changing them to .fna:

for file in *.fa; do name="$(basename "$file" .fa).fna"; mv "$file" "$name"; done

Let me know if that helps.
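For anyone who hits the same error, the two checks above (rename the .fa files, then make sure nothing else is sitting in the directory) can be sketched as a small shell function. This is only an illustration: check_and_rename is a made-up helper name, not something metabolisHMM provides, and it assumes the genomes sit directly in the directory you pass in.

```shell
# Sketch only: check_and_rename is a hypothetical helper, not part of metabolisHMM.
# It renames every *.fa file in a directory to *.fna, then reports any
# remaining file whose name does not end in .fna or .faa.
check_and_rename() {
    dir="$1"

    # Rename each .fa file. The -e test skips the literal pattern
    # when the glob matches nothing.
    for file in "$dir"/*.fa; do
        [ -e "$file" ] || continue
        mv "$file" "${file%.fa}.fna"
    done

    # Report leftovers that the tool would likely reject.
    for file in "$dir"/*; do
        [ -e "$file" ] || continue
        case "$file" in
            *.fna|*.faa) ;;
            *) echo "unexpected file: $(basename "$file")" ;;
        esac
    done
}
```

Calling it as check_and_rename path/to/genomes (the path is a placeholder) prints one line per file that still needs attention and prints nothing when the directory is clean. It does not look at hidden files, and it does not check that the directory holds only nucleotide or only protein files, so a mix of .fna and .faa still has to be sorted out by hand.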

Streptomyces1 commented 3 years ago

Hey, it worked! I was able to generate the heatmap. One last question: I saw some nice pictures of the software on your Twitter where the graph had a legend showing the color of each metabolism. Did you add this afterwards, or is there an option to insert it directly from the command line?

All the best, Yuri.

elizabethmcd commented 3 years ago

I had to insert these manually - I can't remember if the workflow automatically takes them out or leaves them all there. But if you look in the code for that workflow the matplotlib/seaborn code probably just has to be modified slightly to get the legends.