Closed by jpalmer37 5 years ago
The minimum counts are always going to be variable and sensitive to sample size. I would start with a simple visualization of PNGS frequencies by site (two bar plots, acutes above the horizontal axis and chronic/AIDS below).
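A minimal sketch of how the bar heights for that mirrored plot could be computed, assuming each sequence has already been reduced to a set of alignment positions carrying a PNGS. The function names and the toy positions are hypothetical, not taken from the actual pipeline. Chronic/AIDS frequencies are negated so they draw below the horizontal axis.

```python
from collections import Counter


def site_frequencies(site_sets):
    """Fraction of sequences that carry a PNGS at each alignment position."""
    n = len(site_sets)
    if n == 0:
        return {}
    # Count each position at most once per sequence.
    counts = Counter(pos for sites in site_sets for pos in set(sites))
    return {pos: c / n for pos, c in counts.items()}


def mirrored_bars(acute_sites, chronic_sites):
    """Rows of (position, acute_height, chronic_height); chronic is negated."""
    acute = site_frequencies(acute_sites)
    chronic = site_frequencies(chronic_sites)
    positions = sorted(set(acute) | set(chronic))
    return [(p, acute.get(p, 0.0), -chronic.get(p, 0.0)) for p in positions]


# Toy input (made-up positions): one set of PNGS positions per sequence.
acute_sites = [{88, 156}, {88}, {88, 276}, {156}]
chronic_sites = [{88, 156}, {156, 276}]

rows = mirrored_bars(acute_sites, chronic_sites)
for position, up, down in rows:
    print(position, up, down)
```

The second and third columns can be passed to any bar-plot routine as two series sharing the same x positions.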
The figure you requested previously:
And a scatter plot comparing frequencies of N-glyc sites common to both acute and chronic cases. (N-glyc sites unique to one of the groups, which would fall along the x and y axes, were excluded for now.)
You have a good point. Wouldn't make sense that they missed something so prominent. I'll double-check my algorithm for making that plot and visually see whether these sites are in fact different in essentially all cases.
Also check that their sequences are in your data sets - if they don't find stark differences between acutes and chronics in their data, and you have those sequences in YOUR data, then it shouldn't be possible for you to see 0% of a PNGS in acute and 100% in chronic.
I figured it out. Really silly mistake on my part. I set my ylim parameter to max out at 250 when the max count for acute was 257. Four bars were excluded because they got cut off.
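One way to avoid this kind of clipping is to derive the axis limit from the data instead of hard-coding it. A minimal sketch, where `padded_ylim` and the 5% padding are illustrative choices and the counts other than 257 are made up:

```python
def padded_ylim(counts, pad=0.05):
    """Return (bottom, top) axis limits that always clear the tallest bar."""
    top = max(counts, default=0)
    if top <= 0:
        # Nothing to plot; fall back to a unit-height axis.
        return (0, 1)
    return (0, top * (1 + pad))


# Toy counts; 257 is the acute maximum mentioned above, the rest are made up.
acute_counts = [120, 257, 98]
ylim = padded_ylim(acute_counts)
print(ylim)

# With matplotlib this could then be applied as: ax.set_ylim(*ylim)
```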
Here's the figure of the fixed N-glyc counts:
and the fixed scatterplot (added in values unique to one group, fixed the denominator):
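For reference, a minimal sketch of how the scatter points could be built so that sites unique to one group are kept with a frequency of 0 in the other. It assumes the denominator is the number of sequences in each group, which may not be exactly the fix applied here; the function name and the toy numbers are hypothetical.

```python
def paired_frequencies(acute_counts, chronic_counts, n_acute, n_chronic):
    """Rows of (position, acute_freq, chronic_freq) over the union of sites."""
    positions = sorted(set(acute_counts) | set(chronic_counts))
    return [
        (
            p,
            acute_counts.get(p, 0) / n_acute,      # 0 if absent from acute
            chronic_counts.get(p, 0) / n_chronic,  # 0 if absent from chronic
        )
        for p in positions
    ]


# Toy data (made up): number of sequences with a PNGS at each position.
acute_counts = {88: 4, 156: 2}
chronic_counts = {156: 2, 276: 1}

points = paired_frequencies(acute_counts, chronic_counts, n_acute=4, n_chronic=2)
for position, x, y in points:
    print(position, x, y)
```

Sites found in only one group land on an axis (x = 0 or y = 0) rather than being dropped.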
Will pass on to Adam for reference data.
Context: I downloaded all available gp120 sequence data on LANL that contained one of three tags (acute, chronic, AIDS). I created an MSA with the conserved regions, generated a RAxML tree, pruned this tree down to 50% of sequences, and extracted all N-glyc sites from these sequences.
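The extraction step can be sketched as below, assuming the usual N-X-S/T sequon definition (X is any residue except proline) and amino-acid sequences with `-` as the gap character. This is an illustration, not the script actually used; `find_pngs` is a hypothetical name, and positions are 0-based in the ungapped sequence, so they would still need mapping back to alignment columns.

```python
import re

# Lookahead so that overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_pngs(sequence):
    """Return 0-based start positions of N-X-S/T sequons (X != P)."""
    ungapped = sequence.replace("-", "").upper()
    return [m.start() for m in SEQUON.finditer(ungapped)]


# Toy sequences (made up) to show the behaviour.
examples = ["NNTSA", "NPSA", "N-GT"]
for seq in examples:
    sites = find_pngs(seq)
    print(seq, sites, len(sites))
```

The length of each returned list is the per-sequence N-glyc count.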
You mentioned previously that I could simply look at the distributions of N-glyc counts in acute and chronic patients.
Apart from the lowest counts in the chronic group, acute and chronic sequences in this data set appear to have little to no difference in their distribution of glycosylation site counts.
Is there anything you'd like me to test with this current data set? Or a different source of acute/chronic patient data you can think of?