hezhaobin commented 4 years ago

@lindseyfaye @rsmoak, let's put together the results we have so far into a PPT format. Perhaps one slide per result + any key method information. I'm currently working on the OrthoMCL results, and plan on including the CATH and Pfam in my global analyses. I will need 5-7 days to get these done. Will either of you be able to put together a 20-30 min presentation for next week? I can do Monday any time, Tuesday afternoon, Wednesday morning or Thursday morning. @janfassler, do you have time constraints?

lindseyfaye commented 4 years ago

I can go first; my preference would be Thursday morning. Does 10 work for everyone?

rsmoak commented 4 years ago

10 Central time on Thursday 05/28/2020 works for me

janfassler commented 4 years ago

My lab meeting is Thursdays at 11:15, so 10 could work, but I'll have to leave promptly. -Jan

hezhaobin commented 4 years ago

Shall we start at 9:30 central time? If 10 works better for all of us, we can also do that. I think one hour is a good chunk of time to discuss. -- Bin

hezhaobin commented 4 years ago

@lindseyfaye can you share the slides you presented today here? @rsmoak, I'd also like to see your tips on MEME and MAST.

hezhaobin commented 4 years ago

@lindseyfaye , can you add the alignment to the 02/data folder?

lindseyfaye commented 4 years ago

Here's the link to my slides: https://docs.google.com/presentation/d/1fRR33Fl5jp104yPFbkp1ihP3Lok_CGk2DdpLRQIME0I/edit?usp=sharing

rsmoak commented 4 years ago

The link to my slides: https://docs.google.com/presentation/d/1sFFlWXh1zrPW9jqEdcN1JLJZEnng9gKyJ4GyMPON9Y0/edit?usp=sharing

hezhaobin commented 4 years ago

Got it!

hezhaobin commented 4 years ago

@janfassler you mentioned a paper about beta-aggregation in 8 C. glabrata adhesins. Can you post the link here?

hezhaobin commented 4 years ago

A brief summary of today's discussion (feel free to amend!)

- Use the literature to suggest thresholds for the beta-aggregation feature.
- Plan a results section comparing the old and new C. glabrata genomes.
- We need to understand MEME's algorithm to know how to use and interpret its results.
  - Are the repeat units serine/threonine rich? @rsmoak @lindseyfaye
- Construct a pan-proteome combining the three C. auris proteomes (using CD-HIT, or reusing the all-against-all blastp results).

I also looked back at @lindseyfaye's presentation from last week and noted a few interesting points:

- The structure model template is a bacterial adhesin. Does this reflect an ancient origin of adhesin genes or convergent evolution? To what degree are the sequences similar?
- The MEME result shows that even among these putative homologs, which were likely identified based on the well-folded N-terminal domain, the larger C-terminal portion can be highly diverse in repeat sequence and frequency, even among the three C. auris strains?!
  - Use synteny to further validate orthology??
- Many homologs of Lindsey's protein are in telomeric regions. We should collect chromosomal locations for all predicted adhesins as a separate feature.

Lastly, a summary of what I'll do next:

- Summarize the OrthoMCL results, try different FungalRV thresholds (default: 0, recommended: 0.511), and look at overlaps with FaaPred.
- I'm considering adding S. cerevisiae to the pool as a closer comparison to C. glabrata.
- Use OrthoMCL and many other features (protein length, chromosomal position, GPI anchor, repeats, etc.) to vet the list and generate a hierarchy of high-, intermediate- and low-confidence adhesin sets.
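
To make the tiering idea concrete, here is a rough sketch of how several features could be combined into confidence tiers. Only the two FungalRV cutoffs (0 and 0.511) come from this thread; the field names and the voting rule itself are placeholders I made up for illustration, not agreed criteria.

```python
from dataclasses import dataclass

FUNGALRV_DEFAULT = 0.0    # default cutoff mentioned in this thread
FUNGALRV_STRICT = 0.511   # recommended cutoff mentioned in this thread


@dataclass
class Candidate:
    protein_id: str        # hypothetical identifier
    fungalrv_score: float
    in_faapred: bool       # passed FaaPred's default threshold
    has_gpi_anchor: bool   # predicted GPI anchor
    has_repeats: bool      # tandem repeats detected


def confidence_tier(c: Candidate) -> str:
    """Assign a tier with made-up rules; replace once real criteria are agreed."""
    # Not called by either predictor: leave it out of the tiers entirely.
    if c.fungalrv_score <= FUNGALRV_DEFAULT and not c.in_faapred:
        return "not predicted"
    # Count independent lines of support (True counts as 1).
    support = sum([
        c.fungalrv_score > FUNGALRV_STRICT,
        c.in_faapred,
        c.has_gpi_anchor,
        c.has_repeats,
    ])
    if support >= 3:
        return "high"
    if support == 2:
        return "intermediate"
    return "low"
```

A simple vote count like this is easy to audit, but it treats every feature as equally informative, which we may not want.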

janfassler commented 4 years ago

For what it's worth, the bacterial structure hit to Lindsey's protein (the ligand-binding domain of the 2180 amino acid Lactobacillus reuteri Lr70902) made me think about lateral gene transfer, so I did several rounds of BLASTP this morning using the first 300 amino acids, the first 600, and all of Lindsey's protein against the unrestricted RefSeq database. I came up with many fungal hits (including S. cer) but no bacterial ones. To confirm that my results weren't curtailed by a hitlist threshold, I repeated the process with a taxonomic restriction to bacteria or to Lactobacillaceae but still came up empty. So despite the similarity in structure (Lindsey, can you confirm that this was done by threading?), there is no strong sequence relationship between bacterial LRRP beta solenoid type adhesins and this C. auris protein.
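
In case anyone wants to repeat the fragment searches, the query files could be prepared along these lines. This is just a sketch: the function name and record labels are hypothetical, and the BLASTP runs themselves are not shown.

```python
def n_terminal_queries(name, sequence, lengths=(300, 600)):
    """Return FASTA text with N-terminal fragments followed by the full sequence."""
    records = []
    for n in lengths:
        # Skip fragment lengths that would just reproduce the full protein.
        if n < len(sequence):
            records.append(f">{name}_first{n}\n{sequence[:n]}")
    records.append(f">{name}_full\n{sequence}")
    return "\n".join(records) + "\n"
```

Each record can then be submitted as a separate query, with or without a taxonomic restriction.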

janfassler commented 4 years ago

This is the reference with examples of adhesins with amyloid character (TANGO): Ramsook CB, Tan C, Garcia MC, et al. Yeast cell adhesion molecules have functional amyloid-forming sequences. Eukaryot Cell. 2010;9(3):393-404. doi:10.1128/EC.00068-09

- Ca HWP/RBT
- Ca EAP1
- Ca EPE1
- Ca ALS
- Sc FLO1
- Sc MUC1/FLO11
- Sc AGA1/FIG2
- NOT: Sc SAG1

[image: Screen Shot 2020-06-02 at 4 08 54 PM]

janfassler commented 4 years ago

About the pangenome - it looks like you are very close with OrthoMCL output: https://kbase.us/applist/apps/PangenomeOrthomcl/build_pangenome_with_orthomcl/release

hezhaobin commented 4 years ago

Oh, that's a really interesting and useful result! Lindsey (@lindseyfaye), you should include it in your results.

rsmoak commented 4 years ago

Hi everyone, my department just notified me about a mandatory lab reopening meeting on Friday at 1 pm. That may interfere with our scheduled meeting. Can we move ours either just before (noon) or sometime 2 or later? Thanks!

hezhaobin commented 4 years ago

No problem. Let's say 2:30-3:30, will that work for everyone? -- Bin

janfassler commented 4 years ago

2 pm or later is perfect for me. Qualifying exams are that morning and I may be involved in proctoring them.

rsmoak commented 4 years ago

Perfect, thanks!

lindseyfaye commented 4 years ago

That works for me.

hezhaobin commented 4 years ago

I'm sure this is an incomplete list. Please reply to add your notes! Also, the slides for today can be accessed here. All previous slides are in this folder. Lastly, here is the interactive tool I built. I plan to update it so the user can restrict the output to the FungalRV > 0.511 or FaaPred subsets.

OrthoMCL v5

- ~8 C. auris proteins map to an orthogroup that has zero members in the other three species within the predicted adhesin set. Given that C. auris is the only species NOT included in constructing OrthoMCL-DB v5, this observation deserves further investigation. Specifically:
  - Is this because the single score cutoff left the orthologs in the other species out of the dataset?
  - Or is this because C. auris co-opted proteins whose ancestral functions are not related to adhesion?
  - Or could it be that these C. auris proteins are not adhesins, but false positives?
- Continue the analysis using v6r1 and see if the results are consistent with v5. If not, which ones are different?
- Can I (HB) make the gene tree reconstruction an automated pipeline and examine more orthogroups, to see if the lineage-specific expansion seen in OG5_132045 is general or specific to a few groups?
- Don't discount the possibility that some seemingly unrelated domains, e.g. alcohol dehydrogenase, beta-glucosidase or glycosyl hydrolase, can exist in adhesin proteins.

Todo

- Investigate the new C. glabrata genome [RS].
- Gene tree for Lindsey's (and potentially Rachel's) adhesin family. [LFS] [JF]
  - Are Lindsey's and Rachel's proteins orthologs? [LFS] [RS]
- Implement gene tree reconstruction pipeline (only for the four species included in the OrthoMCL analysis). [HB]
- Build relational databases to make it easier to query the various results. [HB] [RS]
- Structure analysis figure, draft, method and result. [LFS]

Next meeting topic

- Gene tree for Lindsey's and Rachel's adhesin families.
- Any other results worth discussing.

hezhaobin commented 4 years ago

@janfassler to your point about not prematurely ruling out enzymes, I was looking up whether ADH1 could have anything to do with cell adhesion and came upon this publication:

Klotz SA, Pendrak ML, Hein RC. 2001. Antibodies to alpha5beta1 and alpha(v)beta3 integrins react with Candida albicans alcohol dehydrogenase. Microbiology (Reading, Engl.) 147:3159–3164. PMID:11700367

"Abstract: It has been hypothesized that Candida albicans possesses integrin-like receptors on its cell surface. This is because C. albicans binds numerous fluid-phase extracellular matrix (ECM) proteins on its cell surface and adheres to the same ECM proteins when immobilized. In addition, numerous antibodies to human integrins (receptors for ECM proteins) bind to the fungal cell surface and in so doing inhibit the binding of the respective proteins. To demonstrate the presence of such a cell surface integrin, a cDNA library of C. albicans yeast cells was screened with polyclonal antiserum to the human fibronectin receptor (alpha5beta1 integrin). Clones isolated by this screening technique also reacted specifically to antiserum against the human vitronectin receptor (alpha(v)beta3 integrin). DNA sequence analysis of the cloned insert predicted a 350 aa protein (37 kDa). This predicted protein showed 75% homology at the nucleotide sequence level to alcohol dehydrogenase (ADH) of Saccharomyces cerevisiae. In vitro transcription/translation of the cloned inserts yielded a 37 kDa protein that was immunoprecipitated with antibodies to the alpha5beta1 and alpha(v)beta3 integrins and an antibody to a C. albicans fibronectin receptor. These antibodies and an mAb to the human vitronectin receptor demonstrated an antigen of ~37 kDa present in the cell-wall preparations of C. albicans and in spent growth medium. All four antibodies reacted with authentic ADH. The possible significance of these results in relation to C. albicans adherence is discussed."

rsmoak commented 4 years ago

@lindseyfaye @hezhaobin I'm working on the relational database, and was wondering if you had full results from some of the analyses you've run? Specifically, the numeric FungalRV and FaaPred results for the C. auris and S. cerevisiae queries? Or any other results that we may have filtered for before uploading resultant fasta files?

hezhaobin commented 4 years ago

Yes, the FungalRV results are in 01-global-adhesin-prediction/output/FungalRV/all-fungalrv-results-20200529.txt, and similarly for FaaPred. The former is a table while the latter is just a list of protein IDs that passed the predictor's default threshold. Let me know if you can't find them.

For relational databases, I was planning to use SQLite and its R interface, RSQLite. They are much easier to set up than MySQL and other similar full-featured SQL environments. Which one are you using?
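
To give a feel for the shape of such a database before we commit to a schema, here is a minimal sketch using Python's built-in sqlite3 module (the same SQL would work from RSQLite). The table names, column names and example rows are all made up for illustration. The only thing taken from this thread is that FungalRV gives a score per protein while FaaPred gives a list of IDs that passed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
conn.executescript("""
CREATE TABLE protein (
    protein_id TEXT PRIMARY KEY,
    species    TEXT NOT NULL
);
CREATE TABLE fungalrv (
    protein_id TEXT PRIMARY KEY REFERENCES protein(protein_id),
    score      REAL NOT NULL
);
CREATE TABLE faapred (
    protein_id TEXT PRIMARY KEY REFERENCES protein(protein_id)
);
""")

# Made-up example rows, not real results.
conn.executemany(
    "INSERT INTO protein VALUES (?, ?)",
    [("prot_A", "C. auris"), ("prot_B", "C. auris"), ("prot_C", "S. cerevisiae")],
)
conn.executemany(
    "INSERT INTO fungalrv VALUES (?, ?)",
    [("prot_A", 0.9), ("prot_B", 0.3), ("prot_C", 0.7)],
)
conn.executemany("INSERT INTO faapred VALUES (?)", [("prot_A",), ("prot_B",)])


def predicted_by_both(connection, cutoff=0.511):
    """Proteins above the FungalRV cutoff that are also on the FaaPred list."""
    rows = connection.execute(
        """
        SELECT p.protein_id, p.species, f.score
        FROM protein p
        JOIN fungalrv f ON f.protein_id = p.protein_id
        JOIN faapred  a ON a.protein_id = p.protein_id
        WHERE f.score > ?
        ORDER BY p.protein_id
        """,
        (cutoff,),
    )
    return rows.fetchall()
```

Keeping one table per predictor, keyed on the protein ID, means new result types can be added later without touching the existing tables.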

rsmoak commented 4 years ago

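@hezhaobin I don't see an actual text file; I may just not know how to access it? This is what I get: [image: image] https://user-images.githubusercontent.com/60475658/84830286-2aea4300-aff7-11ea-9795-e821efcefe69.png

I downloaded the RSQLite package and am working up a relational database schema right now.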

hezhaobin commented 4 years ago

Yes, it's a "soft link" and the content you see points you to the actual text file. It's like a "shortcut" in Windows. So just go down to the local-result-HB folder and you will see the file there.

hezhaobin commented 4 years ago

Notes for yesterday's meeting

[ ] Finish OrthoMCL v6r1 analysis. [HB]
[x] Run TANGO on other sequences in Jan's gene tree. [JF] [RS]
[x] Run FungalRV locally on the new C. glabrata proteome. [HB]
[ ] Test Albert's "Maximal" algorithm as a way to get at the repetitive motifs? [HB]
[x] Test to see if we can programmatically extract the beta-aggregation sequence motifs based on the TANGO and XSTREAM results. [RS]
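
For the last item, one possible approach is sketched below. It assumes the per-residue aggregation scores have already been parsed from the TANGO output into a list aligned with the sequence (the parsing is not shown). The cutoffs of score > 5 over at least 5 consecutive residues are a commonly used convention, but treat them as an assumption here, not a decision we have made.

```python
def aggregation_segments(sequence, scores, min_score=5.0, min_length=5):
    """Return (start, end, subsequence) for runs of residues scoring above min_score.

    start and end are 1-based, inclusive positions in the protein.
    """
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    segments = []
    run_start = None  # 0-based index where the current run began
    for i, score in enumerate(scores):
        if score > min_score:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # The run ended at i - 1; keep it only if it is long enough.
            if i - run_start >= min_length:
                segments.append((run_start + 1, i, sequence[run_start:i]))
            run_start = None
    # Handle a run that extends to the C-terminus.
    if run_start is not None and len(scores) - run_start >= min_length:
        segments.append((run_start + 1, len(scores), sequence[run_start:]))
    return segments
```

The extracted subsequences could then be tallied across proteins to see which motifs recur.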

hezhaobin commented 4 years ago

OK, I have solved the problem with the new CBS138 proteome. It turns out the file I downloaded back in February contained CDS (DNA) sequences. I just downloaded the protein sequences and ran FungalRV locally. The number of sequences with scores above 0 is 162, matching what @rsmoak got from the webapp. Mystery solved. The new version predicted 20 more adhesins (at both the 0 and 0.511 cutoffs, meaning that all 20 new predictions have a score > 0.511). For details see https://github.com/binhe-lab/C037-Cand-auris-adhesin/tree/master/01-global-adhesin-prediction/script/FungalRV_adhesin_predictor

hezhaobin commented 4 years ago

I've posted today's discussion notes under 00-misc-docs/2020-07-02-discussion-zoom.md

hezhaobin commented 4 years ago

A quick update and plan for next meeting:

- I've been reconstructing the gene tree for XP_028889033, with more species and combining blast hits from both FungiDB and RefSeq. Most of the results are now available in the case study folder under output/gene-tree. My next step is to analyze the resulting trees for gene gains and losses and to color-code the tips by species.
- From my perspective, the next biggest thing is to work out the properties of the low complexity region, including the following questions:
  - What types of repeats does each sequence have? Do sequences that are more closely related (inferred from the N-terminal domain) also share the same repeat units and structure in the C-terminus?
  - What is the relationship between repeat units and beta-aggregation motifs?
  - Are the repeats S/T-rich?
  - Technically, how can we use MEME + TANGO or other software combinations to answer the above questions?
- With respect to our next meeting: I can present the Monday after next, mainly because I have an important talk to give next Thursday and preparing for it will take a large chunk of my time until then. @rsmoak will you have things to share next Monday? If so, we can meet at our normal time next Monday. Regardless, I think it would be most helpful if you and @janfassler could put more thought into the repeat and beta-aggregation part.

Cheers Bin

rsmoak commented 4 years ago

@hezhaobin I can at least update on Monday, even if it isn't long. Let's meet at our usual time. I'll send an invitation.

hezhaobin commented 4 years ago

sounds good! -- Bin

janfassler commented 4 years ago

I'm available on Monday and will try to get something together as well. I've been reading about other types of amyloid proteins and at least one paper has asked evolutionary questions concerning the relationship between the repeats and the unique sequences; more or less what we are interested in, so I can talk about that. One problem I've wasted too much time on is that no matter how I run Tango at the command line (batch file, single sequence file, or simple input) I don't get the desired output. It's something else entirely, much less useful. I'm looking for workarounds. -Jan

rsmoak commented 4 years ago

Jan, can you send me a multi fasta of your sequences? I assume you want an output like the other TANGO file outputs. I'll try and troubleshoot TANGO.

hezhaobin commented 4 years ago

Lindsey has a zoom meeting next Monday from 3-3:45. @janfassler @rsmoak let's start at 3:45. I'll send an invite soon.

janfassler commented 4 years ago

I've been working on PCA - using the amino acid data in the Higgs and Attwood book as a test case. I made a markdown file that might work for other situations. @rsmoak can you direct me to the summary file of adhesin data that you have been compiling? I'd like to see what adaptations may be needed for this more complex dataset. Thanks.

rsmoak commented 4 years ago

@janfassler I haven't updated this with all of the new species I've been working on, but the current summary table can be found at 01-global-adhesin-prediction/output/combined_results.txt. The TANGO and XSTREAM results are heavily simplified in the summary table to agg_seqs (number of aggregation sequences past the TANGO thresholds) and num_tr (number of tandem repeats in the protein), respectively.

janfassler commented 4 years ago

Thanks! Probably, the simpler the better, for a first pass!

hezhaobin commented 4 years ago

@rsmoak @janfassler @lindseyfaye I summarized my presentation yesterday: https://github.com/binhe-lab/C037-Cand-auris-adhesin/blob/master/00-misc-docs/2020-07-20-discussion-zoom.md

One interesting new finding: when I summarized the results of the GPI-anchor prediction, I found that the vast majority of the 110 sequences were predicted to have a GPI anchor, supporting their potential role as adhesins (anchored on the cell wall), but only 5/17 homologs in the mysterious S. stipitis were. I feel the GPI-anchor prediction is likely to have high sensitivity and specificity, given its relatively simple rule (N-terminal signal peptide and C-terminal GPI signal peptide), so the large number of homologs in S. stipitis may actually be involved in some other processes!

todo

[x] Parse TANGO results and identify alternative enriched motifs in non C. auris group sequences
[x] Explore alternative ML trees from the bootstrap deck and investigate differences in their topology.

hezhaobin commented 4 years ago

@janfassler @rsmoak @lindseyfaye I created a shared presentation file for us to add content -- for next Monday's meeting. Jan, Rachel, if you have most of your slides in powerpoint format, feel free to continue working with that. When you are done, just upload it here: https://drive.google.com/drive/folders/1EdSbLmY5Dzml7BjGU6ROISzPMVDeRVI9?usp=sharing

hezhaobin commented 4 years ago

Also, a quick update on @janfassler's question regarding whether the N-terminal domain (350 aa) could have homologs in phage/bacteria: I did a HMMER search with the first 350 aa and restricted the taxonomy to viruses, archaea and eubacteria. The e-value cutoff was 0.01 and no matches were found. I then repeated the search with blastp against the non-redundant protein database restricted to viruses and bacteria. This time I got three significant hits, and both the percent identity and query coverage are respectable. However, I'm now puzzled as to why only P. syringae has this domain, and only in some strains. See below for details (scroll to the bottom): https://github.com/binhe-lab/C037-Cand-auris-adhesin/tree/master/02-case-studies/output/blast

hezhaobin commented 4 years ago

@janfassler I finally checked the e-value cutoff question you raised regarding the identification of XP_028889033 homologs. My conclusion, based on two analyses, is that they are genuine homologs. See the last section of the analysis shown below: https://rpubs.com/emptyhb/649295

janfassler commented 4 years ago

@hezhaobin I see what you've done and I agree that the proteins you pulled out likely do have valid N-terminal hyphally regulated domains which is what we agreed to look for and to use in our phylogenetic analyses. As an aside, I do think it's important to remind ourselves that the proteins we identify this way (via BLAST) are homologs of that domain, and not necessarily homologs of the (full-length) query protein. Likewise, the phylogeny is of the domain and not necessarily of any particular adhesin.

hezhaobin commented 4 years ago

I fully agree. Will be careful in describing this result and discussing the evolutionary dynamics of this group of genes sharing this N-terminal domain. More analysis of the clustering of the C-terminus, by looking at the types and distribution of short motifs such as beta-aggregation sequences and MEME identified motifs could shed more light onto the evolutionary history of this group of genes.

hezhaobin commented 4 years ago

discussion notes (my recollection) and todo list here @lindseyfaye https://github.com/binhe-lab/C037-Cand-auris-adhesin/blob/master/00-misc-docs/2020-08-17-discussion.zoom.md

@janfassler can you send me the paper documenting the Ser vs Thr difference, and the mutagenesis analysis of the beta aggregation sequence, if it's not the one Lindsey showed at the end today?

janfassler commented 4 years ago

Just to clarify @hezhaobin hypotheses 1 and 2 in your summary are alternatives, yes? They aren't actually meant to be considered together.

This is the paper I had in mind with experimental mutation of a beta-aggregation prone sequence: Rousseau et al., 2006. Protein aggregation and amyloidosis; confusion of kinds? Curr. Opin. Struct. Biol.
It seems like a good guess that amino acids 2-7 of the 7 amino acid sequence GVVIVTT correspond to positions 1 through 6 in the figure.

Another paper that might be helpful in this regard is Ramsook et al., 2010. Yeast cell adhesion molecules have functional amyloid-forming sequences. Eukaryotic Cell.
Table 1 from this paper spells out the beta aggregation sequences in various adhesins in C. albicans and in S. cerevisiae, all very hydrophobic (like our GVVIVTT). The authors comment: Ile, Thr, and Val residues have aliphatic β-branched side chains that greatly restrict backbone conformation and have high β-strand potential (6). These residues are very hydrophobic, bulky, and have side-chain interactions that stabilize the β-sheets in amyloids. These properties are what we might expect in sequences whose primary purpose is to form amyloids. In contrast, the adhesin sequences had very few aromatic residues, which are the major category of β-aggregation- and amyloid-prone sequences in other proteins. Thus, the β-aggregation-prone sequences in the adhesins are also biased against aromatic residues. We suggest that the unusual composition of the adhesin amyloid sequences leads to the unusually facile amyloid formation that these peptides and proteins display.

Finally, I'm attaching a few slides with my observations about the differences in distribution of serine threonine bias in C. albicans ALS proteins versus Rachel's C. auris adhesin. Serine-Threonine.pptx

hezhaobin commented 4 years ago

> Just to clarify @hezhaobin hypotheses 1 and 2 in your summary are alternatives, yes? They aren't actually meant to be considered together.

Yes, they are alternative, mutually-exclusive hypotheses.

Got the papers. Will look into them.

hezhaobin commented 4 years ago

@janfassler Jan, you once mentioned a paper that talked about the Serine-rich domain, and that was the motivation for looking at Serine and Threonine content separately. Can you remind me of that paper and the idea behind it?

janfassler commented 4 years ago

Hi Bin,

@hezhaobin It was the ALS5 paper with the figure below that caused me to map the serines and threonines individually in Rachel’s protein.

Otoo HN, Lee KG, Qiu W, Lipke PN. Candida albicans Als adhesins have conserved amyloid-forming sequences. Eukaryot Cell. 2008;7(5):776-782. doi:10.1128/EC.00309-07

Below find the following 3 images which can also be seen in the Powerpoint (Serine_Threonine.pptx) that I posted above a few weeks ago.

Image 1: Domain cartoon of Als5 from Otoo paper
Image 2: Although it’s true that the percent serine and threonine are both high in Rachel's protein, the distribution differs with the C terminus being enriched for threonine and serines scattered throughout
Image 3: Domain cartoon of Rachel's protein
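
For anyone who wants to reproduce this kind of per-residue mapping on other proteins, a sliding-window count is one simple way to do it. The sketch below is only an illustration: the window and step sizes are arbitrary defaults, and it is not necessarily how the images above were made.

```python
def windowed_fraction(sequence, residue, window=50, step=25):
    """Fraction of `residue` in each window, as (start, fraction) pairs.

    start is the 1-based position of the first residue in the window.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = sequence.upper()
    if not seq:
        return []
    result = []
    # Sequences shorter than one window are reported as a single window.
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        result.append((start + 1, chunk.count(residue) / len(chunk)))
    return result
```

Calling it once with "S" and once with "T" gives two profiles along the protein that can be plotted side by side.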

hezhaobin commented 4 years ago

Thanks Jan! Should have read your previous reply - have to say that I still haven't absorbed the information with respect to beta-aggregation prone vs amyloid forming sequences, and the role of serine, threonine and glycosylation in these contexts. Have you read anything that would suggest a reason for the serine-rich domain immediately after the N-terminal Hyphal_reg_CWP, followed by a moderately Threonine-rich stalk?

binhe-lab / C037-Cand-auris-adhesin

Share progress results #4
