NASA-PDS / registry-api

Web API service for the PDS Registry, providing the implementation of the PDS Search API (https://github.com/nasa-pds/pds-api) for the PDS Registry.
https://nasa-pds.github.io/pds-api

As a user, I don't want to see the default fields in properties structure when I use the fields= parameter #394

Open tloubrieu-jpl opened 7 months ago

tloubrieu-jpl commented 7 months ago

Checked for duplicates

Yes - I've already checked

πŸ§‘β€πŸ”¬ User Persona(s)

Data user

💪 Motivation

...so that I am not confused by the behavior of the fields parameter when I use the default response format application/json.

📖 Additional Details

In the following request, for example:

curl --get 'https://pds.nasa.gov/api/search/1/classes/collections' \
    --data-urlencode 'limit=10' \
    --data-urlencode 'q=(pds:Primary_Result_Summary.pds:processing_level eq "Raw")' \
    --data-urlencode 'fields=pds:Primary_Result_Summary.pds:processing_level' | json_pp

The properties structure of the response currently contains:

 "properties" : {
            "lidvid" : [
               "urn:nasa:pds:mess_ns_raw:data_edr::1.0"
            ],
            "ops:Label_File_Info.ops:file_ref" : [
               "https://pds-geosciences.wustl.edu/messenger/mess-e_v_h-grns-2-ns-rawdata-v1/messns_1001/data/collection_data_edr.xml"
            ],
            "ops:Tracking_Meta.ops:archive_status" : [
               "archived"
            ],
            "pds:File.pds:creation_date_time" : [
               "2018-06-28T00:00:00Z"
            ],
            "pds:Modification_Detail.pds:modification_date" : [
               "2018-06-18T00:00:00Z"
            ],
            "pds:Primary_Result_Summary.pds:processing_level" : [
               "Raw"
            ],
            "pds:Time_Coordinates.pds:start_date_time" : [
               "2004-08-12T19:01:02Z"
            ],
            "pds:Time_Coordinates.pds:stop_date_time" : [
               "2015-04-30T18:53:47Z"
            ],
            "product_class" : [
               "Product_Collection"
            ],
            "ref_lid_instrument" : [
               "urn:nasa:pds:context:instrument:ns.mess"
            ],
            "ref_lid_instrument_host" : [
               "urn:nasa:pds:context:instrument_host:spacecraft.mess"
            ],
            "ref_lid_investigation" : [
               "urn:nasa:pds:context:investigation:mission.messenger"
            ],
            "ref_lid_target" : [
               "urn:nasa:pds:context:target:planet.earth",
               "urn:nasa:pds:context:target:planet.mercury",
               "urn:nasa:pds:context:target:planet.venus"
            ],
            "title" : [
               "MESSENGER NS Raw (EDR) Data Collection"
            ],
            "vid" : [
               "1.0"
            ]
         },

Now we only want:

 "properties" : {
            "pds:Primary_Result_Summary.pds:processing_level" : [
               "Raw"
            ]
         }

The other fields come from the default JSON structure, where these values are needed (start time, title...). We no longer want to see them, so that the user is not confused.

The same applies to the summary.properties part of the response.
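The requested behavior can be sketched as a simple key filter. This is only an illustration in Python, not the registry-api implementation: the function name `filter_properties` is hypothetical, and the sample dictionary is a trimmed copy of the properties structure shown above.

```python
from typing import Dict, Iterable, List

Properties = Dict[str, List[str]]


def filter_properties(properties: Properties, fields: Iterable[str]) -> Properties:
    """Keep only the entries whose key was explicitly requested (hypothetical helper)."""
    requested = set(fields)
    return {name: values for name, values in properties.items() if name in requested}


# Trimmed-down copy of the properties structure from this issue.
properties = {
    "lidvid": ["urn:nasa:pds:mess_ns_raw:data_edr::1.0"],
    "pds:Primary_Result_Summary.pds:processing_level": ["Raw"],
    "title": ["MESSENGER NS Raw (EDR) Data Collection"],
}

filtered = filter_properties(
    properties, ["pds:Primary_Result_Summary.pds:processing_level"]
)
print(filtered)
# {'pds:Primary_Result_Summary.pds:processing_level': ['Raw']}
```

The same filter would be applied wherever a properties structure appears in the response, so that no default field leaks through in one place but not another.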

Acceptance Criteria

Given an API server, When I perform a request with a fields parameter and the default response format application/json, Then I expect to see only the requested field in the properties substructures.

βš™οΈ Engineering Details

No response

scholes-ds commented 7 months ago

I can now see the value of including the default core entry fields along with specifically requested fields, in some cases. Is it worth considering a flag, switch, parameter, etc., that indicates whether to include the primary identifying entry values along with the specifically requested fields? Something like primaryFields, identifierField, default fields... We do something like this with the ODE API.
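One way to picture such a switch is sketched below in Python. Everything here is an assumption made for illustration: the flag name `include_identifiers`, the function name `select_properties`, and the choice of `lidvid` and `title` as the "primary identifying" fields are not defined anywhere in this thread or in the API.

```python
from typing import Dict, Iterable, List

# Hypothetical set of "primary identifying" fields; the thread does not fix this list.
IDENTIFIER_FIELDS = ("lidvid", "title")


def select_properties(
    properties: Dict[str, List[str]],
    fields: Iterable[str],
    include_identifiers: bool = False,
) -> Dict[str, List[str]]:
    """Return the requested fields, optionally together with the identifier fields."""
    wanted = set(fields)
    if include_identifiers:
        wanted.update(IDENTIFIER_FIELDS)
    return {name: values for name, values in properties.items() if name in wanted}


sample = {
    "lidvid": ["urn:nasa:pds:mess_ns_raw:data_edr::1.0"],
    "pds:Primary_Result_Summary.pds:processing_level": ["Raw"],
    "title": ["MESSENGER NS Raw (EDR) Data Collection"],
    "vid": ["1.0"],
}
requested = ["pds:Primary_Result_Summary.pds:processing_level"]

strict = select_properties(sample, requested)
with_ids = select_properties(sample, requested, include_identifiers=True)
print(sorted(strict))
print(sorted(with_ids))
```

With the flag off, only the requested field is returned; with it on, the identifier fields are added and nothing else.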

anilnatha commented 1 month ago

I like the way our OpenSearch EN Cluster was operating, in that if you don't specify which fields to return, every field is returned by default.

I was surprised when I provided a fields param to our Search API and got back more fields than I had specifically asked for. I think this is a detriment for app developers who want to optimize their queries for performance.

To keep it simple, give developers granular control, and avoid adding too much complexity to our Search API, I think an API call should return either everything by default or only the fields provided in the fields param.
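For comparison, the OpenSearch behavior mentioned above maps onto the `_source` option of a search request body: when `_source` is omitted the whole document comes back, and when it is a list only those fields are returned. The Python sketch below only builds such a body. The helper name `build_search_body` is hypothetical, and it assumes, without confirmation from this thread, that the field names stored in the index match the names used in the API's fields param.

```python
import json
from typing import Any, Dict, Optional, Sequence


def build_search_body(
    query: Dict[str, Any], fields: Optional[Sequence[str]] = None
) -> Dict[str, Any]:
    """Build an OpenSearch-style search body (hypothetical helper).

    No fields (None or empty): `_source` is omitted, so every field is returned.
    Fields given: `_source` lists exactly those fields and nothing else.
    """
    body: Dict[str, Any] = {"query": query}
    if fields:
        body["_source"] = list(fields)
    return body


match_all = {"match_all": {}}
everything = build_search_body(match_all)
only_level = build_search_body(
    match_all, ["pds:Primary_Result_Summary.pds:processing_level"]
)
print(json.dumps(everything))
print(json.dumps(only_level))
```

Under this rule there is no third case to document: either the caller names the fields, or the caller gets all of them.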