solgenomics / sgn

The code behind the Sol Genomics Network, Cassavabase and other Breedbase websites
https://solgenomics.net
MIT License
66 stars 35 forks source link

Missing Trait Data from materialized_phenotype_jsonb_table #3616

Closed dwaring87 closed 3 years ago

dwaring87 commented 3 years ago

Expected Behavior

Example of the problem:

From the trial detail page of a trial with 5 traits sampled:

Possible Reason Why:

The materialized_phenoview table correctly has the data summarized.

Example of a single plot from a problematic trial - materialized_phenoview with phenotype joined (includes all 5 traits):

SELECT * FROM materialized_phenoview 
LEFT JOIN phenotype USING (phenotype_id)
WHERE trial_id = 1307 AND stock_id = 292530;
phenotype_id breeding_program_id location_id year_id trial_id accession_id seedlot_id stock_id trait_id uniquename observable_id attr_id value cvalue_id assay_id individual_id sp_person_id name create_date collect_date operator
1414336 326 26 2021 1307 84789 NULL 292530 77255 Stock: 292530, trait: Heading time - Julian date (JD) date: NA2021-06-04_03:56:24 operator = jrutkoski 77255 NULL 139 77255 NULL NULL NULL NULL 2021-06-04 03:56:33.841806 NULL jrutkoski
1419631 326 26 2021 1307 84789 NULL 292530 77413 Stock: 292530, trait: Plant height - cm date: NA2021-06-08_19:47:39 operator = lbmunaro 77413 NULL 106.68 77413 NULL NULL NULL NULL 2021-06-08 19:47:40.493816 NULL lbmunaro
1432887 326 26 2021 1307 84789 NULL 292530 77715 stock: 292530, trait: Grain moisture content - %, date: NA2021-07-11_02:44:11, operator: jrutkoski 77715 NULL 13.8 77715 NULL NULL NULL NULL 2021-07-11 02:44:14.140739 NULL jrutkoski
1432888 326 26 2021 1307 84789 NULL 292530 77427 stock: 292530, trait: Grain test weight - g/l, date: NA2021-07-11_02:44:11, operator: jrutkoski 77427 NULL 719.543735911438 77427 NULL NULL NULL NULL 2021-07-11 02:44:14.140739 NULL jrutkoski
1432889 326 26 2021 1307 84789 NULL 292530 77495 stock: 292530, trait: Grain yield - kg/ha, date: NA2021-07-11_02:44:11, operator: jrutkoski 77495 NULL 5138.72477 77495 NULL NULL NULL NULL 2021-07-11 02:44:14.140739 NULL jrutkoski
NULL 326 26 2021 1307 84789 NULL 292530 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
NULL 326 26 2021 1307 84789 NULL 292530 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL

Example of the same trial and plot from materialized_phenotype_jsonb_table:

SELECT * FROM materialized_phenotype_jsonb_table 
WHERE trial_id = 1307 AND observationunit_stock_id = 292530;
observationunit_stock_id observationunit_uniquename observationunit_type_name germplasm_uniquename germplasm_stock_id rep block plot_number row_number col_number plant_number is_a_control notes trial_id trial_name trial_description plot_width plot_length field_size field_trial_is_planned_to_be_genotyped field_trial_is_planned_to_cross breeding_program_id breeding_program_name breeding_program_description year design location_id planting_date harvest_date folder_id folder_name folder_description seedlot_transaction seedlot_stock_id seedlot_uniquename seedlot_current_weight_gram seedlot_current_count seedlot_box_name treatments observations available_germplasm_seedlots
292530 DH_Urb_21-plot1305 plot Kaskaskia 84789 13 13 1305 (Operator: jrutkoski, Time: ) 1307 DH_Urb_21 DH Augmented Yield Trial at Urbana, IL [2021] 326 University of Illinois Wheat breeding program at the University of Illinois 2021 Augmented 26 {"2020-10-05T00:00:00","2020-10-05T00:00:00","","#"} {"No ManagementFactor": null} [{"value": "139", "outlier": null, "operator": "jrutkoski", "trait_id": 77255, "trait_name": "Heading time - Julian date (JD)|CO_321:0001233", "uniquename": "Stock: 292530, trait: Heading time - Julian date (JD) date: NA2021-06-04_03:56:24 operator = jrutkoski", "create_date": "2021-06-04T03:56:33.841806", "collect_date": null, "phenotype_id": 1414336, "associated_image_id": null, "associated_image_type": null, "phenotype_location_id": 26, "phenotype_location_name": "Urbana, IL", "associated_image_project_id": null, "associated_image_project_name": null, "associated_image_project_time_json": null}, {"value": "106.68", "outlier": null, "operator": "lbmunaro", "trait_id": 77413, "trait_name": "Plant height - cm|CO_321:0001301", "uniquename": "Stock: 292530, trait: Plant height - cm date: NA2021-06-08_19:47:39 operator = lbmunaro", "create_date": "2021-06-08T19:47:40.493816", "collect_date": null, "phenotype_id": 1419631, "associated_image_id": null, "associated_image_type": null, "phenotype_location_id": 26, "phenotype_location_name": "Urbana, IL", "associated_image_project_id": null, "associated_image_project_name": null, "associated_image_project_time_json": null}] []

Note that the observations JSON only includes 2 traits:

[
  {
    "value": "139",
    "outlier": null,
    "operator": "jrutkoski",
    "trait_id": 77255,
    "trait_name": "Heading time - Julian date (JD)|CO_321:0001233",
    "uniquename": "Stock: 292530, trait: Heading time - Julian date (JD) date: NA2021-06-04_03:56:24  operator = jrutkoski",
    "create_date": "2021-06-04T03:56:33.841806",
    "collect_date": null,
    "phenotype_id": 1414336,
    "associated_image_id": null,
    "associated_image_type": null,
    "phenotype_location_id": 26,
    "phenotype_location_name": "Urbana, IL",
    "associated_image_project_id": null,
    "associated_image_project_name": null,
    "associated_image_project_time_json": null
  },
  {
    "value": "106.68",
    "outlier": null,
    "operator": "lbmunaro",
    "trait_id": 77413,
    "trait_name": "Plant height - cm|CO_321:0001301",
    "uniquename": "Stock: 292530, trait: Plant height - cm date: NA2021-06-08_19:47:39  operator = lbmunaro",
    "create_date": "2021-06-08T19:47:40.493816",
    "collect_date": null,
    "phenotype_id": 1419631,
    "associated_image_id": null,
    "associated_image_type": null,
    "phenotype_location_id": 26,
    "phenotype_location_name": "Urbana, IL",
    "associated_image_project_id": null,
    "associated_image_project_name": null,
    "associated_image_project_time_json": null
  }
]

Temporary Workaround

Manually forcing the CXGN::Phenotypes::PhenotypeMatrix class to use Native search type instead of MaterializedViewTable fixes the problem of missing data.

diff --git a/lib/CXGN/Phenotypes/PhenotypeMatrix.pm b/lib/CXGN/Phenotypes/PhenotypeMatrix.pm
index 22916795e..04914e6d2 100644
--- a/lib/CXGN/Phenotypes/PhenotypeMatrix.pm
+++ b/lib/CXGN/Phenotypes/PhenotypeMatrix.pm
@@ -151,6 +151,9 @@ sub get_phenotype_matrix {
     my $include_pedigree_parents = $self->include_pedigree_parents();
     my $include_timestamp = $self->include_timestamp;

+    # TODO: REMOVE WHEN BUG IS FIXED
+    $self->search_type("Native");
+
     print STDERR "GET PHENOMATRIX ".$self->search_type."\n";

     my $phenotypes_search = CXGN::Phenotypes::SearchFactory->instantiate(

For Bugs:

Environment

Steps to Reproduce

dwaring87 commented 3 years ago

After looking into this some more, it looks like the materialized_phenotype_jsonb_table does not get properly refreshed after adding trait data to a trial.

The materialized_phenotype_jsonb_table only gets refreshed when the refresh_matviews function/script gets called with the phenotypes option. Currently, the only time the script is used with this option is in the BrAPI v2 store observations function.