Open hnlixuanji opened 12 months ago
Hi XJ
Thank you for your interest in our project.
We plan to release a web server. That is why our GitHub sample code and Google Colab version handle one protein sequence at a time; batch processing will be performed on the server later. For the time being, for multiple proteins, we suggest writing a bash script that runs the tool on each FASTA file sequentially.
We have various scripts for batch processing that we used during our experiments (they were not released because they have not been properly cleaned and refactored). However, if you can specify what your input looks like and how you would like the output to be formatted, I can share some scripts accordingly.
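For anyone who wants a starting point in the meantime, here is a minimal sketch of the sequential approach suggested above, written in Python rather than bash. It assumes the input is a plain multi-record FASTA file and splits it into one FASTA file per record, so that each file can be passed to the single-sequence sample code in turn. The function names `read_fasta` and `split_fasta` and the `seq_000000.fasta` naming scheme are illustrative only and are not part of Domain-PFP.

```python
from pathlib import Path
from typing import Iterator, List, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a multi-record FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def split_fasta(path: str, out_dir: str) -> List[Path]:
    """Write each record to its own FASTA file; return paths in input order."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for index, (header, sequence) in enumerate(read_fasta(path)):
        # Hypothetical naming scheme: zero-padded record index.
        target = out / f"seq_{index:06d}.fasta"
        target.write_text(f">{header}\n{sequence}\n")
        written.append(target)
    return written
```

The returned list of paths can then be looped over, calling the single-sequence code once per file.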
Dear nibtehaz, thank you very much for your reply. Our input is a file containing a catalog of all non-redundant genes (each record starts with ">gene name"). I need to convert all genes to protein sequences before using your tool. I would like the output to be a CSV file containing all the genes, with the column names "gene_name", "GO_MF", "MF_definination", "Go term", "Confidence", "GO_BP", "BP_definination", "Go term", "Confidence", "GO_CC", "CC_definination", "Go term", "Confidence". Or perhaps you have better ideas or scripts for reporting all the genes.
BTW, I have an open question :-) have you tried to or plan to integrate Alpha-Fold into the function prediction?
Best, XJ
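A rough Python sketch of the CSV layout requested above, in case it helps pin down the format. The column names are copied as written in the request. How the four columns per aspect map onto a prediction is an assumption here: GO identifier, definition text, term name, confidence. The `results` structure and the names `build_header` and `write_table` are hypothetical and do not reflect Domain-PFP's actual output.

```python
import csv

ASPECTS = ("MF", "BP", "CC")


def build_header():
    """Return the 13 column names, spelled as in the request."""
    header = ["gene_name"]
    for aspect in ASPECTS:
        header += [f"GO_{aspect}", f"{aspect}_definination", "Go term", "Confidence"]
    return header


def write_table(results, out_path):
    """Write one row per gene.

    `results` (assumed shape) maps gene name -> {aspect: (go_id, definition,
    term_name, confidence)}. Aspects with no prediction are left blank.
    """
    with open(out_path, "w", newline="") as handle:
        # csv.writer is used instead of DictWriter because the requested
        # header repeats "Go term" and "Confidence" for each aspect.
        writer = csv.writer(handle)
        writer.writerow(build_header())
        for gene, per_aspect in results.items():
            row = [gene]
            for aspect in ASPECTS:
                hit = per_aspect.get(aspect)
                row += list(hit) if hit else ["", "", "", ""]
            writer.writerow(row)
```

Because the header repeats two names, tools that load columns by name may need the repeated columns renamed (for example with an aspect prefix) before further processing.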
Hi XJ
Sure, I can prepare a script like that. The input will be a large FASTA file, right?
We have plans to use structure from AlphaFold in protein function prediction. But at this moment we are not actively pursuing that.
Yes, it is a large FASTA file. Thanks a lot!
Dear Author.
Thank you for your contribution to protein function prediction. According to your paper, Domain-PFP performs well against many of the latest tools. I am considering using your tool to assign functions to multiple protein sequences (tens of thousands) in a FASTA file. The highest-confidence MF, BP, and CC terms will be selected for each sequence, so I was wondering whether you have already developed a script for this, so that I don't have to write it again :-)
Best, XJ
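The selection step described in this post (keeping the highest-confidence MF, BP, and CC term for each sequence) can be sketched in a few lines of Python. This assumes the predictions for one sequence are available as `(aspect, go_id, confidence)` tuples; that shape and the function name `top_per_aspect` are assumptions made for illustration, not the tool's actual output format.

```python
def top_per_aspect(predictions):
    """Return {aspect: (go_id, confidence)} with the best term per aspect.

    `predictions` is an iterable of (aspect, go_id, confidence) tuples for a
    single sequence. On ties, the first term encountered is kept.
    """
    best = {}
    for aspect, go_id, confidence in predictions:
        if aspect not in best or confidence > best[aspect][1]:
            best[aspect] = (go_id, confidence)
    return best
```

Aspects with no predictions are simply absent from the returned dictionary, so the caller can decide how to report them.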