Open peterdesmet opened 7 years ago
From what I know of both, they are certainly working in similar spaces, so there are parallel thought processes going on. The TDWG work is targeting a vocabulary and set of tests specifically to apply to DwC terms (started around 3 years ago), with the hope that it will be picked up by GBIF, VertNet and others and applied consistently. The TDWG work will likely not go as far as defining an implementation approach, and will leave that as an exercise for the reader. Whip goes a lot further and strictly defines a structure for implementation, but it is not tied to DwC in any way, nor does it define tests itself. My understanding is that an implementation of the TDWG TG outputs could be produced in Whip or in another tool, e.g. eBay Griffin.
For info: GBIF will try to work primarily with ALA, and hopefully others, to standardise the interpretation of data in the ingestion routines, and will likely use something yet to be decided but compatible with Apache Spark.
Lee Belbin sent an email on March 3 to the TDWG Biodiversity Data Quality IG group regarding the work of WG2: Data Quality Tests and Assertions:
I'm not familiar with this output, but @cgendreau @tucotuco you're both highlighted as contributing members for this. Would you care to explain the scope of these tests/assertions vs whip? How are the approaches different and what is the chance we're duplicating efforts?