dlebauer closed this issue 7 years ago
@dlebauer In preparation for bulk upload, I had to make the following changes:
Column "sitename" must be "site".
Hyphens are required in date values; e.g., 2016-08-08, not 20160808.
"NDVI" must be a trait variable in the trait_covariate_associations table. I added a row for trait NDVI (with optional covariate "age").
Entries in the citations_sites table must be added to associate each of the sites in the file with Ward 2016. (I haven't done this yet. UPDATE: THIS IS DONE NOW.)
I will have to add the method associations manually. The Bulk Upload wizard doesn't provide for this.
As you noted, the genotypes (cultivar_id values) will need to be added manually as well.
I'm ignoring the Range and Pass columns. This information is of course already in the site column. I'm also ignoring the rep column.
I expect to finish this up tomorrow afternoon unless you need it before then.
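The column rename and date reformatting listed above could be scripted before upload. What follows is only a minimal sketch, not the procedure actually used: it assumes the file is a CSV, that the date column is named `date`, and the sample row (including its NDVI value) is made up for illustration.

```python
import csv
import io
from datetime import datetime


def prepare_for_bulk_upload(csv_text):
    """Rename 'sitename' to 'site' and rewrite YYYYMMDD dates as YYYY-MM-DD."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Rename the header; every other column keeps its name and position.
    fieldnames = ["site" if name == "sitename" else name for name in reader.fieldnames]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if "sitename" in row:
            row["site"] = row.pop("sitename")
        # 'date' is an assumed column name; only compact 8-digit dates are rewritten.
        value = row.get("date", "")
        if len(value) == 8 and value.isdigit():
            row["date"] = datetime.strptime(value, "%Y%m%d").strftime("%Y-%m-%d")
        writer.writerow(row)
    return out.getvalue()


# Made-up sample row, for illustration only.
sample = "sitename,date,NDVI\nMAC Field Scanner Season 2 Range 1 Pass 1,20160808,0.5\n"
print(prepare_for_bulk_upload(sample))
```

Dates that already contain hyphens pass through unchanged, so the function can be rerun safely on a partly corrected file.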
@dlebauer I was going to finish this this afternoon but I realized I need to add a treatment for this data set. (No existing treatment is associated with the Rick Ward citation.) What should the treatment be?
@gsrohde the treatment should be the same as the one for Maria's data, and if it does not exist it can be 'observational'
@dlebauer The names of Maria's treatments are "Control", "low density", "medium density", and "high density". Which of these, if any, should I use? Or is it better to have distinct treatments for distinct citations (even if the treatments are essentially the same)? Or does it matter?
Use "Control"; density treatments were from the first season.
Also, what should the access level be?
2
OK, done (except for adding genotype information).
Need to add genotype data before closing.
@rickw-ward do you have times for these flights?
@dlebauer I have lost the thread. Which flights?
@rickw-ward looks like we have the dates: https://docs.google.com/spreadsheets/d/1qeyYA4x8WX_OnahC6Dr7k1Ut4TuSObMJyYFvOyA_1gY/edit#gid=0
@gsrohde have these been uploaded?
Method should be https://terraref.ncsa.illinois.edu/bety/methods/6000000003
@rickw-ward can you confirm - was the NDVI measured by the Parrot Sequoia or the MicaSense camera?
@dlebauer I uploaded this data set on 11/8 (see https://github.com/terraref/computing-pipeline/issues/201#issuecomment-259294680). The dates were already in the upload, and I set the method manually to "UAV based NDVI" (on 12/02? They were all updated on that date). So to reiterate, as far as I know, the only thing remaining is to add the cultivar_id values to tie in the genotype information (the solitary checkbox in the Description section).
@gsrohde The final step for updating the cultivar_id is to run the updates in this sheet, each of which sets a cultivar_id for a particular site_id.
After running the relevant updates, you can close this issue and then open a new one in which we can consider a more automated solution for capturing the cultivar-plot relationships. My proposal to add cultivar_id to the experiments_sites table (https://github.com/PecanProject/bety/issues/410#issuecomment-217205761) might work. The problem is that the solution is very specific to breeding trials. What do you think?
@dlebauer I looked at the spreadsheet with the update statements and there are some problems:
The update statements have no WHERE clauses, so each one will update the whole traits table. For example, where you wrote
update traits set site_id = (select id from sites where name like 'MAC Field Scanner Plot 1 Season 2%'), cultivar_id = (select id from cultivars where name = 'Ton-a-Milk');
I assume you intended
update traits set cultivar_id = (select id from cultivars where name = 'Ton-a-Milk') WHERE site_id = (select id from sites where name like 'MAC Field Scanner Field Plot 1 Season 2');
(Notice the extra "Field" before "Plot": all season 2 sitenames containing the string "Plot" have the form "MAC Field Scanner Field Plot % Season 2" where "%" is some integer.)
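The difference between the two statements can be checked on a toy database. This is only an illustrative sketch: SQLite stands in for the real database, the three tables are cut down to the columns named in the statements above, and all ids and rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy schema: only the columns the update statements refer to (ids are made up).
conn.executescript("""
    CREATE TABLE sites (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cultivars (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE traits (id INTEGER PRIMARY KEY, site_id INTEGER, cultivar_id INTEGER);
    INSERT INTO sites VALUES (1, 'MAC Field Scanner Field Plot 1 Season 2'),
                             (2, 'MAC Field Scanner Field Plot 2 Season 2');
    INSERT INTO cultivars VALUES (10, 'Ton-a-Milk');
    INSERT INTO traits VALUES (100, 1, NULL), (101, 2, NULL);
""")

# Without a WHERE clause, the update touches every row in traits.
cur = conn.execute(
    "UPDATE traits SET cultivar_id = (SELECT id FROM cultivars WHERE name = ?)",
    ("Ton-a-Milk",),
)
unscoped_rows = cur.rowcount
conn.rollback()  # undo the table-wide update

# With a WHERE clause, only traits at the named site are updated.
cur = conn.execute(
    """
    UPDATE traits
       SET cultivar_id = (SELECT id FROM cultivars WHERE name = ?)
     WHERE site_id = (SELECT id FROM sites WHERE name = ?)
    """,
    ("Ton-a-Milk", "MAC Field Scanner Field Plot 1 Season 2"),
)
scoped_rows = cur.rowcount
conn.commit()

print(unscoped_rows, scoped_rows)
```

Comparing the affected-row count with the number of rows expected for one site is a cheap sanity check before committing an update like this.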
But even as corrected, these updates won't touch the traits in the set I uploaded on 11/8/2016. All site names in that data set have the form "MAC Field Scanner Season 2 Range % Pass %" whereas the site names in the "update" spreadsheet all have the form "MAC Field Scanner Field Plot % Season 2" (except in the latter part of the table where column C (the cultivar name column) is empty).
If you gave me verbal instructions superseding the instructions in comment https://github.com/terraref/computing-pipeline/issues/201#issuecomment-267458905, I don't recall what they were.
The cultivar_id updates (as corrected) will, however, apply to the traits @ZongyangLi uploaded on Jan. 12 since those do involve sites with names of the form "MAC Field Scanner Field Plot % Season 2".
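The two site-name forms described in this comment can be told apart mechanically, which makes it easy to see which update statements can match which uploaded traits. A small sketch follows; the regular expressions are my reading of the two forms quoted above (with "%" taken to be an integer), and the function name and labels are hypothetical.

```python
import re

# The two naming forms quoted in the comment above; "%" is taken to be an integer.
RANGE_PASS = re.compile(r"^MAC Field Scanner Season 2 Range (\d+) Pass (\d+)$")
FIELD_PLOT = re.compile(r"^MAC Field Scanner Field Plot (\d+) Season 2$")


def naming_form(site_name):
    """Return which naming form a season 2 site name follows (labels are hypothetical)."""
    if RANGE_PASS.match(site_name):
        return "range-pass"
    if FIELD_PLOT.match(site_name):
        return "field-plot"
    return "other"


for name in (
    "MAC Field Scanner Season 2 Range 3 Pass 4",
    "MAC Field Scanner Field Plot 1 Season 2",
    "MAC Field Scanner Plot 1 Season 2",  # missing the extra "Field"
):
    print(name, "->", naming_form(name))
```

An update keyed on a "field-plot" name can never reach a trait whose site uses the "range-pass" form, which is the mismatch pointed out above.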
I think the update statements here will work https://docs.google.com/spreadsheets/d/1xIRwroYObD125I0TDCDp3YPIGPi2Qzeh-vlUCg_b85Q or at least provide enough information to identify site-cultivar pairs
@dlebauer I assume you mean use columns R and S in https://docs.google.com/spreadsheets/d/1xIRwroYObD125I0TDCDp3YPIGPi2Qzeh-vlUCg_b85Q/edit#gid=732810075 (the fourth sheet) to generate the update statements?
Yes
@dlebauer I did the updates—all the trait rows I inserted on 2016-11-08 now have cultivar ids. Note that many site names in column R don't match any existing trait sites.
many site names in column R don't match any existing trait sites.
That is the expected result of not having data for the border rows.
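For reference, applying a list of site-cultivar pairs and collecting the site names that match no trait rows might look roughly like the sketch below. It is not the script that was actually run: SQLite stands in for the real database, the schema is reduced to the columns involved, the function name is hypothetical, and the rows and the second cultivar name are invented for the demonstration.

```python
import sqlite3


def apply_cultivar_pairs(conn, pairs):
    """Set traits.cultivar_id for each (site name, cultivar name) pair.

    Returns the site names for which no trait row was updated.
    """
    unmatched = []
    for site_name, cultivar_name in pairs:
        cur = conn.execute(
            """
            UPDATE traits
               SET cultivar_id = (SELECT id FROM cultivars WHERE name = ?)
             WHERE site_id IN (SELECT id FROM sites WHERE name = ?)
            """,
            (cultivar_name, site_name),
        )
        if cur.rowcount == 0:
            unmatched.append(site_name)
    return unmatched


# Toy data (made up): one site with a trait row, one border-row site with none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cultivars (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE traits (id INTEGER PRIMARY KEY, site_id INTEGER, cultivar_id INTEGER);
    INSERT INTO sites VALUES (1, 'MAC Field Scanner Season 2 Range 2 Pass 2'),
                             (2, 'MAC Field Scanner Season 2 Range 1 Pass 1');
    INSERT INTO cultivars VALUES (10, 'Ton-a-Milk'), (11, 'Hypothetical Border Cultivar');
    INSERT INTO traits VALUES (100, 1, NULL);
""")

pairs = [
    ("MAC Field Scanner Season 2 Range 2 Pass 2", "Ton-a-Milk"),
    ("MAC Field Scanner Season 2 Range 1 Pass 1", "Hypothetical Border Cultivar"),
]
unmatched = apply_cultivar_pairs(conn, pairs)
conn.commit()
print(unmatched)
```

Sites that come back in the unmatched list are those with no trait data, which is the situation described above for the border rows.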
Description
@rickw-ward has provided NDVI data for five dates.