RobertsLab / resources

https://robertslab.github.io/resources/
18 stars 10 forks source link

Next steps for mytilus byssus expression/ did I do this PCA correctly #1799

Closed graceleuchtenberger closed 4 months ago

graceleuchtenberger commented 6 months ago

I finally got a PCA to work with some code from Sam's snow crab repository. Here's the plot: Screenshot 2024-02-05 at 2 09 59 PM

See the overall code here.

Here's the code section I used to generate the plot:

Visualize differences between treatment groups

# Filter data
treatmentinfo.F <- treatmentinfo.2 %>% filter(tissue == "F" | day == "3") %>%
  filter(!(day == "0" | tissue == "G"))
treatmentinfo.F$sample.1 <- factor(treatmentinfo.F$sample.1)
treatmentinfo.F$tissue <- factor(treatmentinfo.F$tissue)
treatmentinfo.F$treatment <- factor(treatmentinfo.F$treatment)
treatmentinfo.F$day <- factor(treatmentinfo.F$day)
countmatrix_F <- subset(countmatrix_2, select=row.names(treatmentinfo.F))

# Calculate DESeq object
dds2 <- DESeqDataSetFromMatrix(countData = countmatrix_F,
                              colData = treatmentinfo.F,
                              design = ~ treatment)

dds2 <- DESeq(dds2)
resultsNames(dds2) # lists the coefficients
summary(dds2)

# Plot PCA

PCAdata <- plotPCA(vst(dds2), intgroup = c("treatment"), returnData=TRUE)
percentVar <- round(100*attr(PCAdata, "percentVar")) #plot PCA of samples with all data
ggplot(PCAdata, aes(PC1, PC2, color=treatment)) + 
  geom_point(size=4, alpha = 5/10) +
  xlab(paste0("PC1: ",percentVar[1],"% variance")) +
  ylab(paste0("PC2: ",percentVar[2],"% variance")) + 
  coord_fixed()

Where do people want to go from here? Did I do the PCA correctly?

kubu4 commented 6 months ago

Regarding where to go from here, @sr320's comment still applies:

graceleuchtenberger commented 6 months ago

Thanks Sam! I think Matt wanted to see a PCA across all four treatments first, so that's why I did what I did. Hoping to get a go ahead to see if a pairwise comparison is worth it given the graph above.

sr320 commented 6 months ago

Yes - I say pairwise info is worth getting.

mattgeorgephd commented 6 months ago

Nice work Grace!

As far as the PCA goes, some improvement could be:

  1. Add day zero foot samples as their own treatment (true control). The control listed is foot tissue from day 3, which is technically a "treatment control".
  2. Add circles (https://stackoverflow.com/questions/34869768/how-to-make-a-pca-plots-as-i-posted-here)

Agree that next steps are trt vs control comparisons of gene expression. The PC1 vs PC2 axis didn't pick up much of the variance, so there is likely a lot of overlap. A list of genes that are differentially expressed across each treatment relative to the true control. I would start with just significant and not filter by lfc to start.

graceleuchtenberger commented 6 months ago

Here's the new graph with the true control. This is just to update you all, I'll keep on with the pairwise comparisons. "Control_3" is day 3 of the control treatment. The pairwise comparisons for the PCA were:

[1] "Intercept" "treatment_control_3_vs_control" [3] "treatment_DO_vs_control" "treatment_OA_vs_control"
[5] "treatment_OW_vs_control"

Screenshot 2024-02-06 at 11 23 22 AM

mattgeorgephd commented 6 months ago

Wow, really interesting! Looks like bringing them into the lab really made a difference.

graceleuchtenberger commented 6 months ago

I did pairwise comparisons (true control to everything else), got PCA's and DEG lists. See the PCA's here

mattgeorgephd commented 6 months ago

Amazing work @graceleuchtenberger!

kubu4 commented 6 months ago

Agreed with @mattgeorgephd - very nice!

Some interesting stuff in those PCAs (e.g. foot shows clear differences in response to OA, but gill do not...)

AHuffmyer commented 6 months ago

Hi Grace! Jumping in here to say cool plots and results! I have a question for my own knowledge - would it make more sense to do a 2 step analysis: 1) DEGs/comparison between control 0 and control 3 to account for tank effects followed by 2) control 3 to all other treatments. If you do a comparison between control 0 and all other treatments, your differential effects are going to be confounded by the change they experienced being brought into the lab. Is a more relevant comparison actually control 3 to all other treatments on that same day?

mattgeorgephd commented 6 months ago

Agreed, a two step analysis would be helpful here. Here is the approach we came up with. Interested in everyone's input:

  1. Generate DEG lists of treatment vs. control 0 (lab control) - gives you impact of lab exposure
  2. Generate DEG lists of treatment vs. control 3 (treatment control) - gives you impact of treatment
  3. Filter DEG lists for significance
  4. Normalize lfc value obtained in step 2 using lfc value in step 1, only including sig DEG obtained in step 2

I think this accomplishes the same thing as the approach that @AHuffmyer suggested, but has the potential to more accurately capture tank level effects as they relate to animals in each treatment. Treatments in this case were not pseudo-replicated by splitting animals into different chambers, so we really are looking at tank level effects.