tijeco / FUSTr

GNU General Public License v3.0

If you already have a protein reference set prepared #7

Open maxnest opened 3 years ago

maxnest commented 3 years ago

Hi, thanks for your awesome pipeline, which is very useful and user-friendly, especially for those just starting out in bioinformatics. Could you please tell me whether you plan to add support for FASTA files with amino acid sequences as input, in addition to FASTA files with nucleotide sequences? And what strategy would you recommend for those who already have prepared protein reference sets but want to run the analysis your pipeline is designed for? Should they separately run the programs included in the pipeline, with the parameters specified in the code? Thank you!

tijeco commented 3 years ago

@maxnest Sorry for the delay! I'm glad FUSTr has been of use to you. The selection analysis used by FUSTr relies on codon alignments, which require nucleotide sequences. There are approaches that use only amino acid sequence data, but those have thus far been beyond the scope of what FUSTr aims to achieve. I may implement that in the future. For now, that is something the user can do downstream. If they only have protein sequences (which I imagine is rarer than only having nucleotide sequence data, given the low cost of next-gen sequencing), the user would then be able to plug that into the pipeline.

If you would like to make a pull request to enable this feature, I would be more than happy to review it for you and help you implement it.
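
To illustrate why a codon alignment needs nucleotide data even when a protein alignment already exists, here is a minimal sketch of protein-guided codon threading, the general idea behind tools such as PAL2NAL. It is not FUSTr's own code: the function name `thread_codons` and the toy sequences are hypothetical, and it assumes the coding sequence is in frame and matches the protein residue for residue (an optional trailing stop codon is dropped).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def thread_codons(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Map one aligned protein sequence back onto its coding sequence.

    Each residue consumes one codon (3 nt) from `cds`; each gap column
    becomes a three-character gap so the reading frame is preserved.
    Hypothetical helper for illustration only.
    """
    n_residues = len(aligned_protein.replace(gap, ""))

    # Drop a trailing stop codon if the CDS carries one extra codon.
    if len(cds) == 3 * (n_residues + 1) and cds[-3:].upper() in STOP_CODONS:
        cds = cds[:-3]

    # Without a matching nucleotide sequence there is nothing to thread.
    if len(cds) != 3 * n_residues:
        raise ValueError(
            f"CDS length {len(cds)} does not match {n_residues} residues"
        )

    # Walk the CDS codon by codon, following the protein alignment columns.
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    return "".join(
        gap * 3 if aa == gap else next(codons) for aa in aligned_protein
    )


if __name__ == "__main__":
    # Toy example: the gap in the protein alignment becomes a codon-sized gap.
    print(thread_codons("M-KV", "ATGAAAGTT"))  # ATG---AAAGTT
```

The length check is the point of the example: a protein-only reference set has no `cds` argument to supply, so the codon-level information that selection tests compare (synonymous versus non-synonymous changes) cannot be recovered from it.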