PNNL-PREMIS / SULI2019

Repository for summer 2019 SULI interns

Developing a QC script for the SLA data #17

bpbond closed this issue 5 years ago

bpbond commented 5 years ago

Tasks here (please check off when completed):

Data

Reading in data and computing SLA

We'd prefer that you each do the following, in your own branches. But if you want to work on a common branch and split the tasks, that's fine too.

Joining

Please open a PR with this code. In the PR comments, paste in the beginning of the output table and plot(s).
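
For orientation, a minimal sketch of the first steps (computing SLA, then joining to the inventory), assuming SLA is leaf area divided by dry mass. The two data frames, their column names (`tag`, `leaf_area_cm2`, `dry_mass_g`, `species`), and all values are hypothetical stand-ins, not the project's real files.

```r
# Hypothetical leaf measurements; real data would be read from file instead
leaves <- data.frame(
  tag           = c(1, 2, 3, 4),
  leaf_area_cm2 = c(50, 42, 61, 30),
  dry_mass_g    = c(0.25, 0.30, 0.20, 0.15)
)

# Hypothetical tree inventory; tag 4 is deliberately missing
inventory <- data.frame(
  tag     = c(1, 2, 3),
  species = c("LITU", "NYSY", "NYSY")
)

# Specific leaf area: area per unit dry mass (cm2 per g here)
leaves$sla <- leaves$leaf_area_cm2 / leaves$dry_mass_g

# Left join: all.x = TRUE keeps leaves that have no inventory match,
# which then show up with species = NA and are easy to spot in QC
sla_data <- merge(leaves, inventory, by = "tag", all.x = TRUE)

# Rows that failed to match the inventory
unmatched <- sla_data[is.na(sla_data$species), ]
```

Keeping unmatched rows rather than dropping them makes a missing inventory entry visible as an `NA` species instead of silently losing the sample.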

haileymoore commented 5 years ago

A box plot showing the SLA values for each species, with individual data points overlaid and labeled by tag #. The #'s are offset horizontally, but the points are too close together to separate them vertically.

SLA_boxplot

bpbond commented 5 years ago

@hmoore28 Nice job! 👏 Cool. Re the overlapping text, remember geom_jitter(), and @stephpenn1 has experience with the ggrepel package, which I believe 'repels' labels (but not points) from each other for readability. Can you color by canopy position please?

Interesting that the SLA values for most species cluster pretty tightly, with some significant exceptions. @lilliehaddock we're going to want to look closely at outliers, e.g. samples from tree 1560.

haileymoore commented 5 years ago

The box plot is now colored by canopy position. I tried both geom_jitter() and geom_text_repel(). When the box plot and the individual data points are drawn together, outliers show up twice: once as the box plot's outlier marker and once as an individual data point, e.g. 1739 in NYSY.

geom_jitter() seems to move the points around but not the labels.

boxplot_position_jitter

With geom_text_repel(), the labels are actually readable, but there are still so many that the plot is super messy.

boxplot_canopy_color_ggrepel
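
A minimal sketch of one way to handle the two issues described in this comment: the doubled outlier points, and labels that do not follow jittered points. The data frame and its column names (`species`, `position`, `tag`, `sla`) are invented for illustration and are not the actual SLA data or the code used in the PR.

```r
library(ggplot2)
library(ggrepel)

# Hypothetical stand-in data with one obvious outlier (sla = 400)
set.seed(1)
sla_data <- data.frame(
  species  = rep(c("NYSY", "LITU"), each = 6),
  position = rep(c("high", "low"), times = 6),
  tag      = as.character(1:12),
  sla      = c(rnorm(5, 150, 10), 400, rnorm(6, 200, 10))
)

# One jitter object with a fixed seed, shared by points and labels so both
# layers receive the same horizontal offsets; height = 0 leaves SLA untouched
pos <- position_jitter(width = 0.2, height = 0, seed = 42)

p <- ggplot(sla_data, aes(x = species, y = sla)) +
  # outlier.shape = NA hides the box plot's own outlier markers, so each
  # observation is drawn only once (by geom_point below)
  geom_boxplot(outlier.shape = NA) +
  geom_point(aes(color = position), position = pos) +
  # geom_text_repel pushes labels away from each other and from the points
  geom_text_repel(aes(label = tag, color = position),
                  position = pos, size = 3, show.legend = FALSE)
```

If the plot is still too cluttered, labeling only the suspected outliers (by passing a filtered data frame to the label layer's `data` argument) is a common alternative.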

bpbond commented 5 years ago

Thanks @hmoore28 ! Nice. Interesting to see those differences. Again, @lilliehaddock we're going to want to check some of those outliers (e.g. 1739, 1621, 1560): wrong species? Mistake in data entry? Or just a goofy leaf, i.e. an oddball but a real one?

lilliehaddock commented 5 years ago

I used geom_text_repel() here, but it still looks cluttered.

Rplot SLA by species and position

I'll look into the outliers and see what is going on there.

marideeweber commented 5 years ago

Weber SLA

bpbond commented 5 years ago

Very nice! Maybe faceting by tree species would be clearer? Also, why do your plots have an NA tree species but Hailey's don't? 🤔

lilliehaddock commented 5 years ago

The "by 1802" LITU doesn't have a species in the inventory, so it was showing up as NA! Hailey helped me fix that issue.

lilliehaddock commented 5 years ago

Below is the SLA plot faceted by plot.

SLA by species, plot, and tag

This is a simplified version with no tag labeling. It's interesting to see how SLA varies by plot!

SLA by plot

Here is the SLA faceted by plot. I'm not sure how to remove the CAGL8 plot though!

SLA by species and plot
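
One possible way to drop an unwanted panel such as CAGL8 is to filter those rows out before plotting, since a facet only appears for values present in the data. This is a sketch under assumptions: the data frame, its column names (`species`, `plot`, `sla`), and the choice of `species` as the faceting variable are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in data; CAGL8 has a single row with no SLA value
sla_data <- data.frame(
  species = c("NYSY", "NYSY", "LITU", "LITU", "CAGL8"),
  plot    = c("A", "B", "A", "B", "A"),
  sla     = c(150, 160, 200, 210, NA)
)

# Keep only the rows whose facet value should be shown
sla_kept <- subset(sla_data, species != "CAGL8")

p <- ggplot(sla_kept, aes(x = plot, y = sla)) +
  geom_point() +
  # One panel per remaining species; CAGL8 no longer gets a panel
  facet_wrap(~species)
```

Filtering on `!is.na(sla)` instead would remove any panel that has no usable measurements, without naming the species explicitly.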

bpbond commented 5 years ago

Interesting! Thanks.