NASA-PDS / registry-api

Web API service for the PDS Registry, providing the implementation of the PDS Search API (https://github.com/nasa-pds/pds-api) for the PDS Registry.
https://nasa-pds.github.io/pds-api
Apache License 2.0
3 stars 5 forks source link

text/csv format is impacted by the repairkit script (apparently) #375

Closed tloubrieu-jpl closed 1 year ago

tloubrieu-jpl commented 1 year ago

Checked for duplicates

Yes - I've already checked

šŸ› Describe the bug

When I do a text/csv request on "ops:Label_File_Info.ops:file_ref"

curl 'https://pds.nasa.gov/api/search/1//products?limit=10&fields=lidvid&fields=title&fields=ops%3ALabel_File_Info.ops%3Afile_ref'  --header 'Accept:text/csv'

I am getting string results with square brakets around the URL,

See example response: """ lidvid,ops:Label_File_Info.ops:file_ref,title "urn:nasa:pds:epoxi_mri:hartley2_photometry:aper_phot::1.0","[https://pds-smallbodies.astro.umd.edu/holdings/pds4-epoxi_mri-v1.0/hartley2_photometry/data/aper_phot.xml]","Small Aperture Photometry" "urn:nasa:pds:epoxi_mri:hartley2_photometry:profile::1.0","[https://pds-smallbodies.astro.umd.edu/holdings/pds4-epoxi_mri-v1.0/hartley2_photometry/data/profile.xml]","Surface Brightness Profiles" "urn:nasa:pds:epoxi_mri:hartley2_photometry:hartley2_mri_anomaly::1.0","[https://pds-smallbodies.astro.umd.edu/holdings/pds4-epoxi_mri-v1.0/hartley2_photometry/document/hartley2_mri_anomaly.xml]","Anomalous behavior of the MRI during EPOXI approach to 103P/Hartley 2" "urn:nasa:pds:compil-comet:nuc_properties:extinct::1.0","[https://pds-smallbodies.astro.umd.edu/holdings/pds4-compil-comet-v1.0/nuc_properties/data/extinct.xml]","Physical Properties of Extinct Comets" "urn:nasa:pds:gbo-ctio:cfccd-19p:description::1.0","[https://pds-smallbodies.astro.umd.edu/holdings/pds4-gbo-ctio-v1.0/cfccd-19p/description.xml]","CTIO Images of 19P/Borrelly with Photometry Overview" """

šŸ•µļø Expected behavior

I expected not to have the brackets inside the double quotes but outside.

šŸ“œ To Reproduce

Run a request with Accept header = text/csv

Have fields=ops:Label_File_Info.ops:file_ref

That might happen on other fields but I did not check.

šŸ–„ Environment Info

Test on the product PDS API

šŸ“š Version of Software Used

1.2

šŸ©ŗ Test Data / Additional context

No response

šŸ¦„ Related requirements

šŸ¦„ #xyz

āš™ļø Engineering Details

No response

alexdunnjpl commented 1 year ago

@tloubrieu-jpl there might be some confusion about correct behaviour here.

The double-quotes aren't there because they're enclosing string-typed values (as would be the case in json like the following)

["someStr", "someOtherStr"]

Rather, they're there to wrap the CSV field value, like the second line of the following:

field1, field2
"value1", "value2"

So the behaviour isn't incorrect, prima facie - whatever the desired behaviour is, it isn't that the url field values should be like ["someUrl"]


The newly-present square brackets are present because the value of the field was a string, and now it's an array containing a string.

I don't see a viable way for the serializer to determine which fields are "single values wrapped in an array because everything is wrapped in an array now" vs "a collection which happens to have only one element in it", so getting rid of the square brackets automatically for the former case isn't really a possibility.

So I guess the question is "when serializing as csv, what is the intended behaviour for fields having array input types, and does that intent change depending on the types of the elements in the array (probably not)?"

tloubrieu-jpl commented 1 year ago

Thanks @alexdunnjpl , I think we need to find a better way to expose these values in the CSV because that would not make a local of sense to the users. We can discuss that at a next breakout.

tloubrieu-jpl commented 1 year ago

We chose to use the | as an internal delimiter for the multi-valued fields. We don't whant square brackets anymore around the list.

alexdunnjpl commented 1 year ago

Serialization root is this call to String.valueOf()

WyriwygBusinessObject is used by virtue of being returned by this call to this.formatters.get(this.format)

~It is unclear what about WyriwygBusinessObject is specific to text/csv representation, or what else may rely upon it and its existing behavior, but ~it is not used in the default/json serialization process, which instead uses PdsProductBusinessObject.

~Need to confirm that there will be no undesirable side-effects from modifying the behaviour of WyriwygBusinessObject~

@tloubrieu-jpl I've tracked dependency on that serializer to application/csv, application/kvp+json, and text/csv. Modification of WyriwygBusinessObject will affect all three of these formats - is that acceptable? If not, I'll subclass it and use it just for text/csv (and application/csv, I assume?)

tloubrieu-jpl commented 1 year ago

Hi @alexdunnjpl ,

Yes, I noted that application/kvp+json was also negatively impacted by the sweepers/repairkit change, so that is fine if your update fixes also this format.

jordanpadams commented 1 year ago

Status: PR Pending