Breeding-Insight / sgn

The code behind the Sol Genomics Network, Cassavabase and BreeDBase websites
https://solgenomics.net
MIT License

[BI-1843] - fix BrAPI observation search endpoint #122

Closed mlm483 closed 1 year ago

mlm483 commented 1 year ago

Description

Jira bug: https://breedinginsight.atlassian.net/browse/BI-1843

Observations (phenotypic data) were missing from exports when BreedBase was the backing service, because the Observations search endpoint returned no data. BreedBase interpreted the trialDbIds sent in the search POST body as both Trial and Study DbIds, most likely because BrAPI and BreedBase use conflicting terminology (see https://gist.github.com/mlm483/b5e1590fdedd0c3dcd116bd29ef19423).
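To make the failure mode concrete, here is a toy model in Python. It is not the BreedBase code (which is Perl and is not shown in this description), and it rests on two assumptions: that a BrAPI Trial corresponds to a BreedBase folder and a BrAPI Study to a BreedBase trial, and that "interpreting the ids as both" means the same id list was applied to both columns. The rows, function names, and values are all hypothetical.

```python
# Toy model only; the real endpoint is implemented in BreedBase's Perl code.
# Assumption: BrAPI Trial <-> BreedBase folder, BrAPI Study <-> BreedBase trial.

ROWS = [
    {"folder_id": 10, "trial_id": 101, "observations": ["height=12"]},
    {"folder_id": 10, "trial_id": 102, "observations": ["height=15"]},
    {"folder_id": 20, "trial_id": 201, "observations": ["height=9"]},
]


def search_conflated(rows, trial_db_ids):
    """Hypothetical buggy behavior: one id list constrains both columns."""
    return [
        r for r in rows
        if r["folder_id"] in trial_db_ids and r["trial_id"] in trial_db_ids
    ]


def search_separated(rows, trial_db_ids=None, study_db_ids=None):
    """Hypothetical intended behavior: each id list constrains its own column."""
    return [
        r for r in rows
        if (not trial_db_ids or r["folder_id"] in trial_db_ids)
        and (not study_db_ids or r["trial_id"] in study_db_ids)
    ]


# No row has trial_id 10, so the conflated filter matches nothing.
print(len(search_conflated(ROWS, [10])))                # 0
# Filtering folders only returns both trials in folder 10.
print(len(search_separated(ROWS, trial_db_ids=[10])))   # 2
```

Under these assumptions, a folder id almost never equals one of its own trial ids, so the conflated filter returns an empty result, which matches the symptom described above.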

Testing

  1. Upload an experiment with Observations (phenotypic data) to a BreedBase-backed program.
  2. Download the experiment's Observation dataset.
  3. Check that the downloaded file contains Observations (phenotypic data).

Alternatively:

  1. Use this query to find valid folder and program DbIds:

    SELECT germplasm_uniquename, trial_id, trial_name, breeding_program_id, breeding_program_name, folder_id, folder_name, observations
    FROM materialized_phenotype_jsonb_table
    LEFT JOIN (
        SELECT stock.stock_id, array_agg(db.name)::text[] AS xref_sources, array_agg(dbxref.accession)::text[] AS xref_ids
        FROM stock
        JOIN stock_dbxref sd ON stock.stock_id = sd.stock_id
        JOIN dbxref ON sd.dbxref_id = dbxref.dbxref_id
        JOIN db ON dbxref.db_id = db.db_id
        GROUP BY stock.stock_id
    ) xref ON xref.stock_id = materialized_phenotype_jsonb_table.observationunit_stock_id;
  2. POST to the search endpoint with the folder and program DbIds from step 1 (the folder DbId goes in trialDbIds, the program DbId in programDbIds):

    curl --location 'http://localhost:7080/brapi/v2/search/observations/?pageSize=1000&page=0' \
    --header 'Content-Type: application/json' \
    --data '{ "programDbIds": ["{programDbId}"], "trialDbIds": ["{trialDbId}"], "pageSize": 10000000 }'
  3. GET the search result endpoint with the searchResultDbId returned in step 2.

    curl --location 'http://localhost:7080/brapi/v2/search/observations/{searchResultDbId}?pageSize=1000&page=0'

    The benefit of this approach is that you can add "studyDbIds": ["{studyDbId}"] to the POST body to verify that search by Study also works as expected.
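Steps 2 and 3 can also be scripted. The following is a minimal sketch using only the Python standard library, not part of this PR. The base URL and query parameters are taken from the curl commands above; the function names are hypothetical. It assumes the POST response carries the search id under `result`, and it accepts both `searchResultsDbId` (the spelling I believe the BrAPI specification uses) and `searchResultDbId` (the spelling used in this description). The body carries only the DbId filters; paging is left to the query string.

```python
import json
import urllib.request

# Taken from the curl commands in the testing steps; adjust for your deployment.
BASE_URL = "http://localhost:7080/brapi/v2"


def build_search_body(program_db_id, trial_db_id, study_db_id=None):
    """Build the POST body; studyDbIds is added only when a study id is given."""
    body = {
        "programDbIds": [str(program_db_id)],
        "trialDbIds": [str(trial_db_id)],
    }
    if study_db_id is not None:
        body["studyDbIds"] = [str(study_db_id)]
    return body


def extract_search_id(response_json):
    """Return the search id from a POST response, or None if absent.

    Assumption: the id sits under "result"; both spellings are accepted.
    """
    result = response_json.get("result") or {}
    return result.get("searchResultsDbId") or result.get("searchResultDbId")


def request_json(url, body=None):
    """POST `body` as JSON when given, otherwise GET; return the parsed JSON."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST" if body is not None else "GET",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def search_observations(program_db_id, trial_db_id, study_db_id=None, page_size=1000):
    """Run the POST-then-GET flow from steps 2 and 3."""
    query = f"?pageSize={page_size}&page=0"
    body = build_search_body(program_db_id, trial_db_id, study_db_id)
    posted = request_json(f"{BASE_URL}/search/observations/{query}", body)
    search_id = extract_search_id(posted)
    if search_id is None:
        # Assumption: a server may answer the POST with the results directly.
        return posted
    return request_json(f"{BASE_URL}/search/observations/{search_id}{query}")
```

With a local instance running, `search_observations("{programDbId}", "{trialDbId}")` (placeholders replaced with ids from step 1) should return a response containing observations; passing a third argument exercises search by Study.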

Checklist