solgenomics / sgn

The code behind the Sol Genomics Network, Cassavabase and other Breedbase websites
https://solgenomics.net
MIT License
66 stars 35 forks source link

Repeat observation not allowed in PUT brapi/v1/observations anymore #3630

Open ctucker3 opened 3 years ago

ctucker3 commented 3 years ago

There was a changed in this PR:

https://github.com/solgenomics/sgn/commit/b3399347190f7c66142a96cea8ea08068e3d9d74

So that repeat observations are now ignored. Not sure if this was an intentional change, but it would be useful to still be able to store repeat observations. @bellerbrock Do you remember if you had a reason for that change?

MFlores2021 commented 3 years ago

it affects v2 /observations also

ctucker3 commented 3 years ago

Also, the guilty line is 568.

lukasmueller commented 3 years ago

needs more discussion :-) Is this a feature or a bug? Others have requested upload of multiple observations per trial and trait... Do not allow values with the same timestamp though The problem is the download, it assumes one value per trial/trait, so that needs to be fixed

ctucker3 commented 3 years ago

Not sure its original intention, but the observations parse method in repos/sgn/lib/CXGN/Phenotypes/ParseUpload/Plugin/Observations.pm only throws an error for non-unique, observation unit, variable, timestamp combos. So it seems like if it started as a bug, somewhere along the line it turned into a feature.

bellerbrock commented 3 years ago

Definitely a complicated one. Historically, multiple measurements over time has usually been handled via trait ontologies. First built into the ontology itself (CBSD 3-month eval, 6-month eval, 9-month eval, etc.), then factored out into independent time ontologies (day1-day365, week1-week52, month1-month12 ) that can be combined with the core ontology via 'Compose a new trait'.

Until recently most data has not had timestamps attached, and from what I can tell the default behavior (that repeat observations of the same trait would be stored if the user didn't chose to overwrite) was an oversight/long-running bug. It caused strange behavior like negative values in the summary stats %missing column on trial detail pages, and was not accommodated for by any analysis tools or download formats.

I like the idea of handling things differently though. Seems like there's a lot of demand for storing repeated measurements differentiated just via timestamp, especially with the fieldbook -breedbase api connection making observation submissions with timestamps so easy. It'll mean lots of changes though, to upload parsers, summary and analysis tools, and download formats.

Lukas is back at then end of next week, maybe we can discuss in more detail then?

MFlores2021 commented 3 years ago

Sounds like a good idea. Let me know when works better

bellerbrock commented 3 years ago

Features from the discussion today:

Upload:

upload decisions

Display:

Download:

hkmanching commented 3 months ago

@lukasmueller I uploaded a phenotype file (in fieldbook export format) that had multiple timestamps. I realized it only kept the data with the first timestamp. I went back and split the data by date to upload individually and I was able to upload the second timestamp data okay, however, the 3rd timestamp through an error (see attachment) and I can not upload anymore with out this error. I also still can't delete trials or trait data in the database, so I can't go back and try again. I think this relates to this issue.

Screenshot 2024-07-04 at 11 20 40 AM