YiyanYang0728 / Vibrio_biofilm_matrix_cluster

Scripts and pipeline for Vibrio biofilm matrix cluster project

Question on Pulling DNAs of specific protein (vpsCG) from indexes annotated by ProkFunFind #1

Open khuynh012 opened 3 days ago

khuynh012 commented 3 days ago

Hi Yang Lab,

Sorry for opening an issue on your repository - this is my first time communicating on GitHub and I still don't know the customs here. Anyhow, hi! I'm Yun, an undergraduate from Olson Lab - your collaborator from Wesleyan University. For my project I am looking to pull the genes coding for a specific protein (vpsG) from all species of Vibrio. Prof Olson directed me to your recently reviewed paper, and I have been working through your annotated biofilm cluster supplemental file to pull the genes by their indexes. When I aligned the pulled genes against vpsG from a specific Vibrio species, nothing aligned. I tried again with RbmB, using your RbmB amino acid sequence (from your supplemental data) to verify whether we had pulled the right gene sequence from the indexes, and still nothing came up. I was wondering if I forgot to make adjustments to the indexes before pulling the sequences? Do you have any advice?

Warmest regards, Yun

YiyanYang0728 commented 2 days ago

Hi Yun,

Thank you for reaching out. We re-annotated all the Vibrio genomes on our end, so the gene indices in our supplemental file are likely different from those in NCBI or other databases, which would explain why pulling genes by index returned nothing. However, if you use BLASTp to compare the RbmB sequences provided in the paper against your genomes, you should definitely get some hits. I suggest starting there, with the RbmB protein sequences, to see if this method works for your genomes.
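As a rough illustration of that check, here is a minimal Python sketch that wraps the BLAST+ `blastp` command and parses its tabular output. It assumes BLAST+ is installed and on your PATH; the file names in the usage note and the e-value cutoff are placeholders I made up for illustration, not values from the paper.

```python
import subprocess

# Default column order of BLAST+ tabular output (-outfmt 6).
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def build_blastp_command(query_faa, subject_faa, evalue=1e-5):
    """Build a blastp command comparing query proteins to a subject protein FASTA.

    The e-value cutoff is an assumed default, not a recommendation from the paper.
    """
    return [
        "blastp",
        "-query", query_faa,
        "-subject", subject_faa,
        "-outfmt", "6",
        "-evalue", str(evalue),
    ]


def parse_outfmt6(text):
    """Parse tabular BLAST output into a list of dicts, one per hit."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(OUTFMT6_COLUMNS, line.split("\t")))
        for key in ("pident", "evalue", "bitscore"):
            hit[key] = float(hit[key])
        hits.append(hit)
    return hits


def run_blastp(query_faa, subject_faa, evalue=1e-5):
    """Run blastp (requires BLAST+ on PATH) and return the parsed hits."""
    result = subprocess.run(
        build_blastp_command(query_faa, subject_faa, evalue),
        capture_output=True, text=True, check=True,
    )
    return parse_outfmt6(result.stdout)
```

For example, `run_blastp("RbmB.faa", "my_vibrio_proteins.faa")` (both file names hypothetical) would return one dict per hit, and the `sseqid` field gives the protein identifier in your own annotation rather than our index. Note that `blastp` expects the subject to be protein sequences; if you only have nucleotide genome sequences, `tblastn` is the BLAST+ program for searching a protein query against nucleotide subjects.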

If that works, please feel free to reach out to me at yiyan.yang@nih.gov, and I'll be happy to provide you with the potential vpsG protein sequences. Does this sound like a good plan to you?

Best regards, Yiyan