cedardevs / onestop

OneStop is a data discovery system being built by CIRES researchers on a grant from the NOAA National Centers for Environmental Information. We welcome contributions from the community!
GNU General Public License v2.0

Bug in onestop-client avro schemas #1508

Open erinreeves opened 3 years ago

erinreeves commented 3 years ago

Bug Description Several details in the python-client Avro schemas look questionable. We need to determine which behavior is correct, then fix either the unit tests or the schemas.

Severity Severity: MEDIUM. Priority: MEDIUM.

To Reproduce Steps to reproduce the behavior:

  1. Run the test_ParsedRecord.py unit tests on the 1508-avro-schemas branch.

Expected Result All tests should pass

Actual Result ERROR: test_fileLocations (test.unit.schemas.psiSchemaClasses.org.cedar.schemas.avro.psi.test_ParsedRecord.test_ParsedRecord) TypeError: Superfluous parameters in call: {'s3://noaa-goes16/ABI-L1b-RadF/2019/303/09/OR_ABI-L1b-RadF-M6C10_G16_s20193030950389_e20193031000109_c20193031000158.nc'}

ERROR: test_relationships (test.unit.schemas.psiSchemaClasses.org.cedar.schemas.avro.psi.test_ParsedRecord.test_ParsedRecord) TypeError: Key type has incorrect type: str instead of Optional[RelationshipType].

FAIL: test_discovery (test.unit.schemas.psiSchemaClasses.org.cedar.schemas.avro.psi.test_ParsedRecord.test_ParsedRecord) assert len(args) == 2 AssertionError

Additional context Found the bug by running the sme/sme.py script against a local OneStop loaded with data from the OneStop-test-data repo (GOES records triggered most of the errors).

erinreeves commented 3 years ago

Relationships tested with scripts/launch_pyconsumer.py using copy of /etc/config/config.yml with log_level=Debug:

Old payload (web publisher, after transform): "relationships": [{"type": {"type": "COLLECTION"}, "id": "fdb56230-87f4-49f2-ab83-104cfd073177"}], "errors": [{"title": null, "detail": null, "status": null, "code": null, "source": null}]}

New: "relationships": [{"type": "COLLECTION", "id": "fdb56230-87f4-49f2-ab83-104cfd073177"}]

relationships=[Relationship(type='COLLECTION', id='fdb56230-87f4-49f2-ab83-104cfd073177')]
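For reference, the difference between the old and new payloads comes down to un-nesting the relationship type. A minimal sketch of that flattening, assuming a simplified `Relationship` dataclass that only mirrors the repr in the log above (the real generated schema class may differ, and `flatten_relationship` is a hypothetical helper, not something in the repo):

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Relationship:
    # Simplified stand-in that mirrors the repr in the debug log;
    # the generated Avro class in onestop-clients may look different.
    type: str
    id: str


def flatten_relationship(raw: Dict[str, Any]) -> Relationship:
    """Hypothetical helper: accept either the old nested shape
    {"type": {"type": "COLLECTION"}} or the new flat shape {"type": "COLLECTION"}."""
    rel_type = raw.get("type")
    if isinstance(rel_type, dict):
        # Old payload: the type value was wrapped in another {"type": ...} object.
        rel_type = rel_type.get("type")
    return Relationship(type=rel_type, id=raw["id"])


old_payload = [
    {"type": {"type": "COLLECTION"}, "id": "fdb56230-87f4-49f2-ab83-104cfd073177"}
]
relationships = [flatten_relationship(r) for r in old_payload]
print(relationships)
```

Whether the field should end up as a plain string or as a `RelationshipType` enum value is exactly the open question from the test_relationships error, so treat the `str` annotation here as an assumption.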

Note: I made this change manually but added it to the next fix story. get_uuid_metadata needed a try/except around it, with a return in the except block, since some records were causing "(404) when calling the HeadObject operation: Not Found". I think those records no longer existed (deleted records).
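A rough sketch of the guard described in the note above, not the actual change. Assumptions: the real code would catch `botocore.exceptions.ClientError`, which is replaced here by a stand-in `FakeClientError` so the snippet runs without boto3; the function signature and the `object-uuid` metadata key are also hypothetical.

```python
class FakeClientError(Exception):
    """Stand-in for botocore.exceptions.ClientError (assumption, illustration only)."""

    def __init__(self, code: str, operation: str):
        super().__init__(
            f"An error occurred ({code}) when calling the {operation} operation: Not Found"
        )
        # botocore exposes the error code under response["Error"]["Code"].
        self.response = {"Error": {"Code": code}}


def get_uuid_metadata(head_object, bucket, key, error_type=FakeClientError):
    """Return the object's UUID metadata, or None if the object is gone.

    head_object is any callable with the shape of an S3 client's head_object;
    the "object-uuid" metadata key is a hypothetical name.
    """
    try:
        response = head_object(Bucket=bucket, Key=key)
    except error_type as err:
        if err.response.get("Error", {}).get("Code") == "404":
            # Deleted or missing object: skip it instead of crashing the consumer.
            return None
        raise  # Anything else (permissions, throttling) should still surface.
    return response.get("Metadata", {}).get("object-uuid")
```

Returning only on 404 and re-raising everything else keeps real failures visible; a bare `except: return` would also hide permission or network problems.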

erinreeves commented 3 years ago

Also changed SqsHandlers and S3MessageAdapter to operate on (transform) a single record and to take only one record as a parameter, since they already operated on only one record.
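To illustrate the single-record shape, here is a hedged sketch, not the code in the PR. The `transform` and `handle_message` names and the returned dict are hypothetical; the only thing taken as given is the standard S3 event notification layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`).

```python
from urllib.parse import unquote_plus


def transform(record: dict) -> dict:
    """Hypothetical adapter step: takes exactly one S3 event record."""
    s3 = record["s3"]
    bucket = s3["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(s3["object"]["key"])
    return {"bucket": bucket, "key": key, "uri": f"s3://{bucket}/{key}"}


def handle_message(body: dict) -> list:
    """Hypothetical handler: the caller does the looping, one record per transform call."""
    return [transform(record) for record in body.get("Records", [])]


example_body = {
    "Records": [
        {"s3": {"bucket": {"name": "noaa-goes16"},
                "object": {"key": "ABI-L1b-RadF/2019/303/09/example+file.nc"}}}
    ]
}
print(handle_message(example_body))
```

Keeping the loop in the handler means the adapter's signature matches what it actually does, which is the point of the change described above.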

erinreeves commented 3 years ago

1508-avro-schemas branch

erinreeves commented 3 years ago

https://github.com/cedardevs/onestop-clients/pull/60