Matatika / tap-shopify

This Shopify tap produces JSON-formatted data following the Singer spec.
GNU Affero General Public License v3.0
3 stars 12 forks source link

Schema errors for validation on required properties. #12

Open konung opened 11 months ago

konung commented 11 months ago

Hi

I seem to be running into a ton of errors with schema. Basically complaining about required values when they come in empty from shopify.

I ended up downloadint catalog and marking up a bunch of values that are expected as strings, integers, objects, etc as nullible .ie. "type": [ "object", "null" ], instead of just "type": [ "object"]

          Also lots of mismatches of object vs array type on things like "tax_lines" for example. 

I can't imagine everybody is doing it manually. Am I missing something?

Thank you!

konung commented 11 months ago

Just to clarify , it's related to issue #8 , but I had to make even more changes to the schema, and in a few spots the type was outright incorrect - "string" instead of "integer" or "object" instead of "array".

ReubenFrankel commented 11 months ago

Agreed - schemas definitely need updating. Will try and bump priority on this.

konung commented 11 months ago

HI @ReubenFrankel You mentioned on another ticket that you are testing daily in your CI/CD, correct?

I have a feeling is that you are testing against a sample store with fairly recent data?

I'm trying to import data for a store, going all the way back to 2014 ( especially things like orders) , where many fields didn't exist, and some data was entered haphazardly, and there are going to be empty many empty fields. Schema's validating for non-empty values or expecting objects where old records have strings, is what I think is causing this.

ReubenFrankel commented 11 months ago

@konung Yes, we have a test store that was created when we first built the tap.

Thanks for the info - I think one for the key requirements here is to make all schema properties nullable by default (this is the default behaviour when you define a schema using the Meltano SDK typing classes, so I'd imagine we want to move towards that), which will solve this and a lot of similar issues.

ReubenFrankel commented 11 months ago

@konung Can you be more specific about any properties in particular you were running into issues with? There have been a lot of schema changes in #14, but unsure if they completely address this issue in particular.

ReubenFrankel commented 11 months ago

@konung As of the latest commit on #14, here are the notable differences with your schema:

Does anything here stand out to you as wrong? Can you show what, for example, abandoned_checkout.discount_codes looks like where it is coming back as an object, array, string or null (as opposed to just an array, as the schema is currently expecting)?

ReubenFrankel commented 11 months ago

Various schema issues addressed in v0.2.0