terraref / reference-data

Coordination of Data Products and Standards for TERRA reference data
https://terraref.org
BSD 3-Clause "New" or "Revised" License

Standard naming scheme for plots #60

Closed dlebauer closed 7 years ago

dlebauer commented 8 years ago

Following terraref/computing-pipeline#187

Background

We need a standard way of defining plots.

How these will be used and grouped:

for reference, here is the schema for the sites and related table

First use case, the Maricopa Field

From @NewcombMaria

Both crops had 2 row border plantings on the East and West sides of the field

Use a plot naming system for plots where the following would be valid:

tagging others: @rickw-ward @tingli3 @Mamatemenrs @ZongyangLi @yanliu-chn @craig-willis

ZongyangLi commented 8 years ago

A 'Range' extends in the Y direction (East-West)

Does 'Range' here mean the North-South direction?

dlebauer commented 8 years ago

@ZongyangLi in the gantry coordinate system, the Y direction is East-West and the X direction is North-South. This is not intuitive and has been the source of great confusion. See https://github.com/terraref/computing-pipeline/issues/174#issuecomment-250893581

ZongyangLi commented 8 years ago

@dlebauer Yes, I know. So, as I understand it, these 54 Ranges correspond to the X direction in the gantry system, which is the North-South direction in a real-world coordinate system.

dlebauer commented 8 years ago

@ZongyangLi sorry - I think I was confused because the range numbers increase going north but each one is oriented E-W.

rickw-ward commented 8 years ago

I yield to @NewcombMaria on terminology. As I use it, Range 1 is in the south and Range 54 is in the north. Ranges are adjoining plots between two alley walkways (which go east/west, i.e. along the Y vector of the gantry).

NewcombMaria commented 8 years ago

@dlebauer, David, if it's helpful, Jeff White and I came up with a definition for 'field plot'. For the Maricopa field, it's useful to consider that there are experimental plots (as you mentioned, 'the unit of replication in an experimental design') and also border plots, which could be useful for observational data. Plot numbering will be different for each planting.

dlebauer commented 8 years ago

@gsrohde could you please do the following:

gsrohde commented 8 years ago

@dlebauer I've made all of these changes. Note, however, that 'Season 2' appears last in site names that have a Field Plot number whereas it appears directly after 'MAC Field Scanner' in names that have a Range and Pass number.
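
One way to tolerate the two orderings described above is to parse each component independently of its position in the name. The following is a minimal sketch, assuming site names look like 'MAC Field Scanner Season 2 Range 10 Pass 3' or 'MAC Field Scanner Field Plot 101 Season 2'. Those exact strings, the regular expressions, and the function name `parse_sitename` are illustrative assumptions, not the format confirmed for BETYdb.

```python
import re

# Hypothetical patterns; the exact site-name strings are assumptions for illustration.
_SEASON = re.compile(r"\bSeason (\d+)\b")
_RANGE_PASS = re.compile(r"\bRange (\d+) Pass (\d+)\b")
_FIELD_PLOT = re.compile(r"\bField Plot (\d+)\b")


def parse_sitename(name):
    """Return a dict of the numeric parts found in a site name.

    'Season N' is searched for anywhere in the name, so both orderings
    (season first or season last) produce the same keys.
    """
    parts = {}
    season = _SEASON.search(name)
    if season:
        parts["season"] = int(season.group(1))
    range_pass = _RANGE_PASS.search(name)
    if range_pass:
        parts["range"] = int(range_pass.group(1))
        parts["pass"] = int(range_pass.group(2))
    field_plot = _FIELD_PLOT.search(name)
    if field_plot:
        parts["field_plot"] = int(field_plot.group(1))
    return parts
```

With this approach the season is recovered from either form, so downstream code would not need to care where 'Season 2' sits in the string.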

ghost commented 7 years ago

@dlebauer can this be closed?

NewcombMaria commented 7 years ago

@dlebauer this is a good time to establish the preferred sitename format for BETYdb and the preferred plot naming scheme, because we are currently putting together the spreadsheet for the next planting (durum wheat). There are 3 questions to answer before the start of this next winter crop:

1) Considering the options below, is there a preference for either (a) a plot location (for example, the first two options in the list) or (b) a plot number (the third option in the list)?

Season Range Column (assuming 'column' is a useful concept)
Season Range Pass
Season Plot
Season Plant

2) For , is this the year of planting, or the year in which the majority of data collection takes place? This next planting will be in December 2016, but emergence and the start of data collection will most likely be in January 2017. What is the standard when a season crosses over 2 years during the winter?

3) What is the preferred way to handle subplots in BETYdb going forward? We discussed 'entity' for rows within plots; is that preferred? The last sorghum crop was planted in 4-row plots. The next wheat planting will be in 2-row plots (2 subplot rows, E and W). The sitenames could be individual rows, or the sitename could be the entire 2-row plot. Which is preferred?

dlebauer commented 7 years ago

@rmgarnett

rmgarnett commented 7 years ago

Ah, so what I was trying to explain earlier today: there are 32 rows that are currently logically arranged as

[2 rows of border] ([4 rows of same genotype] x 7) [2 rows of border]

and we are looking at the interior two rows of each of the 7 "4-row" plots. Numbering them as 16 two-row plots results in the following division:

[2 rows of border] ([2 rows of same genotype as ->, only care about right] [2 rows of same genotype as <- only care about left] x7) [2 rows of border]

Which is very odd. The 2-row plot division seems to accomplish nothing useful. Seems like any of these would make more sense:

(1 2) (3) (4 5) (6) (7) (8 9) ...

So I can easily pull out the plot corresponding to (4 5), which is currently impossible.

Personally, I think simply numbering by row rather than arbitrary and confusing 2-row plots would allow any later change to the planting scheme to be easily dealt with.
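
The mismatch described above can be checked with a little arithmetic. The following is a minimal sketch, assuming rows are numbered 1 to 32 from one edge of the field and that 2-row plots are formed by pairing consecutive rows; the function names and the numbering direction are assumptions made for illustration.

```python
# Layout as described above: 2 border rows, seven 4-row genotype plots, 2 border rows.
BORDER_ROWS = 2
ROWS_PER_GENOTYPE_PLOT = 4
GENOTYPE_PLOTS = 7
TOTAL_ROWS = 2 * BORDER_ROWS + ROWS_PER_GENOTYPE_PLOT * GENOTYPE_PLOTS  # 32


def two_row_plot(row):
    """1-based index of the 2-row plot containing a 1-based row number."""
    return (row - 1) // 2 + 1


def genotype_plot(row):
    """1-based index of the 4-row genotype plot for a row, or None for border rows."""
    if row <= BORDER_ROWS or row > TOTAL_ROWS - BORDER_ROWS:
        return None
    return (row - BORDER_ROWS - 1) // ROWS_PER_GENOTYPE_PLOT + 1


def interior_rows(plot):
    """The two middle rows of a 4-row genotype plot (1-based plot index)."""
    first = BORDER_ROWS + (plot - 1) * ROWS_PER_GENOTYPE_PLOT + 1
    return (first + 1, first + 2)


# For each genotype plot, show its interior rows and the 2-row plots they fall in.
for plot in range(1, GENOTYPE_PLOTS + 1):
    left, right = interior_rows(plot)
    print(plot, (left, right), (two_row_plot(left), two_row_plot(right)))
```

Under these assumptions the two interior rows of every 4-row plot land in two different 2-row plots, whereas indexing by row lets any grouping be rebuilt afterwards.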

NewcombMaria commented 7 years ago

Thanks Roman for your description of the Sorghum Season 2 layout, with 2-row borders and 4-row 'plots'. I tried to illustrate what you described in a PowerPoint slide. There's good reason to break the polygons and numbering scheme down to the smallest unit, the row, with 32 rows across, as you suggest. Most people will associate the term 'plot' with an experimental unit, which can be multiple rows or a single row, but the numbered units should probably be the rows, and these can be combined as necessary in different seasons and different planting designs.

image

rmgarnett commented 7 years ago

Fantastic picture, thanks!

ghost commented 7 years ago

from Nadia:

Plots can be defined as the field area that is occupied by a single genotype in a given rep. We can do 1 row, 2 row, 3 row, 4 row plots (whatever we want) depending on how many rows of one genotype we want in a particular area. This will change from experiment to experiment.

It is best to keep the data separated by row, but we need to have the ability to combine row information in order to generate plot-level data.
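
Combining row-level data into plot-level data could look like the following minimal sketch. It assumes a per-experiment mapping from row number to plot identifier and uses a simple mean as the aggregate; the function name `plot_level_means`, the mapping, and the sample values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def plot_level_means(row_values, row_to_plot):
    """Average row-level values within each plot.

    row_values:  {row_number: measurement}
    row_to_plot: {row_number: plot_id}; rows missing from this map
                 (e.g. border rows) are skipped.
    """
    grouped = defaultdict(list)
    for row, value in row_values.items():
        plot = row_to_plot.get(row)
        if plot is not None:
            grouped[plot].append(value)
    return {plot: mean(values) for plot, values in grouped.items()}


# Hypothetical example: rows 3-6 form one 4-row plot, rows 7-8 one 2-row plot.
row_to_plot = {3: "A", 4: "A", 5: "A", 6: "A", 7: "B", 8: "B"}
row_values = {1: 9.0, 3: 1.0, 4: 2.0, 5: 3.0, 6: 4.0, 7: 10.0, 8: 20.0}
plot_means = plot_level_means(row_values, row_to_plot)
```

Because the row-to-plot mapping is a separate input, the same row-level records can be regrouped when the number of rows per plot changes from one experiment to the next.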

ghost commented 7 years ago

See https://github.com/terraref/reference-data/issues/114 for more discussion