dlebauer closed this issue 4 years ago
Based on a quick look, the updated format looks good
One thing to note is that each metadata value in the table needs to correspond to a record in the database. Therefore, you will need to create any records (e.g., citations, methods, sites) that do not already exist before uploading.
Note that many fields are optional - you only need to fill in the ones required to uniquely identify a record.
I think there is a way to test the data with the API that will return these and any other errors that would prevent the data from being inserted. @max-zilla do you know the best way to do this?
@dlebauer Could you also add @abby621 to our project? She is also going to follow up on this.
@dlebauer I think @Andrewzhzy is Zeyu Zhang from GWU; can we add him here to follow up on this?
@ZongyangLi sorry about that; I've added Zeyu and Abby
Hi @dlebauer, I signed up on BETYdb and tried to create citations, but it seems my account doesn't have permission to do so. Is there anything else I should do to get the permission? Thanks!
Hi Zeyu - I just updated your permissions - you should now be able to create new records.
Thanks! But the method page still can't be opened: Scanner 3d ply data to height https://terraref.ncsa.illinois.edu/bety/methods/6000000009
Hi @dlebauer, we just got the results for leaf length and leaf width. The code and descriptions are here: https://github.com/GWUvision/Sorghum-Leaf and the CSV-format data is here: https://1drv.ms/f/s!AmSJkn9rHbVrob4jOTGcm7eo9oSU5Q Could you check whether the format works? Also, how can we upload these to BETYdb?
Thanks
First step is to add a record in the BETY database to describe the method: https://terraref.ncsa.illinois.edu/bety/methods/new. Or, we could change the one for "Scanner 3d ply data to leaf length" to "Scanner 3d ply data to leaf length and width" if one algorithm is doing both.
Next step is to try either the bulk upload here: https://terraref.ncsa.illinois.edu/bety/bulk_upload/start_upload or the API https://pecan.gitbook.io/betydbdoc-dataentry/adding-trait-and-yield-data/adding-traits-via-the-v1-api
I would start with a small subset of the file to check that the format and content are correct (the error messages can be helpful, but debugging a large file submitted all at once is hard). If you have questions, @kimberlyh66 and I can help with this (and we are also on Slack).
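For the API route, here is a minimal sketch in Python of what a submission could look like, assuming the v1 `traits.csv` endpoint described in the docs linked above. The host, API key, sitenames, and trait values are placeholders, and the request is composed but not sent:

```python
import csv
import io
import urllib.parse
import urllib.request

# Placeholder values -- substitute your own BETYdb host and API key.
BETY_HOST = "https://terraref.ncsa.illinois.edu/bety"
API_KEY = "your-api-key"

# A two-row subset of the trait CSV, per the advice above to test a small
# sample before submitting the full file.  Column names follow the CSV
# specification in the BETYdb data-entry docs; values here are made up.
rows = [
    {"local_datetime": "2018-06-01", "site": "MAC Field Scanner Season 6 Range 10 Column 1",
     "citation_author": "Zhang", "citation_year": "2018", "leaf_length": "55.2"},
    {"local_datetime": "2018-06-01", "site": "MAC Field Scanner Season 6 Range 10 Column 2",
     "citation_author": "Zhang", "citation_year": "2018", "leaf_length": "61.7"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
payload = buf.getvalue().encode("utf-8")

# Compose (but do not send) the POST request; urllib.request.urlopen(req)
# would submit it and return the server's response, including any errors.
url = "%s/api/v1/traits.csv?%s" % (BETY_HOST, urllib.parse.urlencode({"key": API_KEY}))
req = urllib.request.Request(url, data=payload,
                             headers={"Content-Type": "text/csv"}, method="POST")
print(req.get_method(), req.full_url)
```

The response body should report row-by-row errors, which is why starting with a two-row sample keeps the feedback loop fast.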
I will try uploading this today
I notice that the plot names refer to "Season 4" but the dates refer to Season 6. @Andrewzhzy these are Season 6 data right? If so we can just fix the plot names I think.
Let me double check exactly what went wrong.
Sorry, it's Season 6. I mislabeled the season number in the site name. Could you help me fix that? Thank you.
@Andrewzhzy to confirm, what @max-zilla suggested in https://github.com/terraref/reference-data/issues/259#issuecomment-516479447 is correct?
@max-zilla I am not sure if this is the prettiest approach but I think that this will generate the update statements that are required.
WITH table1 AS (
  SELECT
    traits.id,
    sites.id AS old_site_id, sitename AS old_sitename
  FROM traits
  JOIN sites ON traits.site_id = sites.id
  WHERE EXTRACT(YEAR FROM traits.date) = 2018
    AND sitename LIKE '%Season 4%'
    AND variable_id IN (SELECT id FROM variables WHERE name IN ('leaf_length', 'leaf_width'))
)
SELECT
  table1.id,
  table1.old_site_id, table1.old_sitename,
  sites.id AS new_site_id, sites.sitename AS new_sitename,
  concat('update traits set site_id = ', sites.id, ' where id = ', table1.id, ';') AS update
FROM table1
JOIN sites ON REPLACE(old_sitename, 'Season 4', 'Season 6') = sites.sitename;
I am receiving a constraint violation, and it appears that I need to fix cultivar_id values, but this doesn't seem to work:
update traits set cultivar_id = NULL
where EXTRACT(YEAR FROM traits.date) = 2018
  AND site_id IN (SELECT id FROM sites WHERE sitename LIKE '%Season 4%')
  AND variable_id IN (SELECT id FROM variables WHERE name IN ('leaf_length', 'leaf_width'));
@dlebauer Yes, the plot name should be Season 6. Thank you for taking care of this bug I made.
@Andrewzhzy I think it would be easier if I delete all of these records - could you fix the csv files and try uploading them again?
@dlebauer I just updated the csv files. I'm ready to upload them anytime. https://1drv.ms/f/s!AmSJkn9rHbVrob4jOTGcm7eo9oSU5Q
@Andrewzhzy did you upload the fixed data? We can close this issue once it's done and we confirm the records are there.
Note that local_datetime column values should be in YYYY-MM-DD format. (See the Date section of https://pecanproject.github.io/bety-documentation/dataentry/adding-trait-and-yield-data.html#detailed-csv-data-file-specifications (for Bulk Upload) or https://pecanproject.github.io/bety-documentation/dataentry/inserting-new-traits-via-the-api.html#schema-for-csv-data-files (for API upload).)
@gsrohde Thank you for pointing that out; I've corrected it. I'm trying to upload, but I keep encountering an error with the bulk upload: "Site is inconsistent with citation in these rows".
Fixed, for reference:
begin;
INSERT INTO citations_sites (citation_id, site_id)
SELECT c.id, s.id
FROM (SELECT id FROM citations WHERE author = 'Zeyu Zhang') AS c
CROSS JOIN
(SELECT id FROM sites WHERE sitename LIKE 'MAC Field Scanner Season 6%') AS s;
commit;
(From @ZongyangLi )
Zeyu from GWU successfully ran the leaf length pipeline from our laser data. The current output data are here: https://drive.google.com/open?id=143bCjxCv61Vf0uE60Z5clLUFQfhP2kox
In order to reformat it into a BETYdb-friendly version, I made another example output CSV file here: https://docs.google.com/spreadsheets/d/1_h3xq_N8x9G35aji40KsJSZqnqmXAXKFYJz3_74cw4c/edit?usp=sharing
If this doesn't work, please let us know the best format for leaf length data. If it works for you, I will work with Zeyu to get it finalized.