tdwg / tnc

Taxonomic Names and Concepts Interest Group

Use of Data Packages #31

Closed nielsklazenga closed 4 years ago

nielsklazenga commented 5 years ago

@mdoering mentioned the use of Data Packages in issue #1, which is now closed.

Just elevating this to a separate issue, to keep it on the radar.

@mdoering, still interested in this?

mdoering commented 5 years ago

I am very much interested in an application of the standard that allows us to exchange data in CSV files, as we can with DwC. Data Packages appear to be the strongest candidate for an existing standard in that area to jump onto. But if TCS NG becomes an RDF-only standard, I will frankly be rather disappointed.

baskaufs commented 5 years ago

@mdoering I just gave Data Packages a look and it's really interesting. Here are a few relevant points:

nielsklazenga commented 5 years ago

@mdoering, that is not at all the intention; we intend to make a specification that is broadly applicable, and not just because that's what the Vocabulary Maintenance Specification requires from TDWG standards. Your use case is definitely a very important one and is very much on our radar. If a lot of the examples were in Turtle, that is just because it is easy to read and write and useful for quickly getting an idea across. Speaking for myself, when I think about data models I think of tables in a database, and if I were to produce RDF with real data, it would start out as something else (database tables), and there would be yet another format (JSON) between the database tables and the RDF.

The models we discussed so far may look complicated, but we really have been talking about only two classes/tables, so they aren't really (and I think everything you get into a database structure you can get into CSV). I am pretty sure I could shoehorn everything we discussed so far into the Darwin Core Taxon class if I had a good crack at it. I might just do that (not right now). The use I see for Label objects is in the interface between identifications and taxonomic names.

If I recall correctly, you were the one who suggested we should look at a domain model. We will definitely have a much closer look at serialization. I created this issue because I thought it would be good to have it on the radar for when we really start looking at that (and because I had a look when you first mentioned it and it looked promising), but maybe it is timely to keep our eyes on that ball now. I will try to use a wider variety of ways to show examples. To be fair, @baskaufs had CSV examples in the document in which he was spruiking the use of SKOS-XL.

nielsklazenga commented 5 years ago

Sorry @baskaufs, I had a better look at your post just now and see that you had already addressed pretty much everything that I just did.

If I provide you with a set of CSV files with data from a taxonomic revision, perhaps even in the form of a Data Package, would you be interested in doing your Guid-O-Matic thing on it? I would be interested to see the result, both the CSV and the RDF.

baskaufs commented 5 years ago

Sure, I can give it a go. We just need to do some mapping of column headers to property URIs. The URIs can just be made-up; it doesn't matter if they are "real" or not.
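To make the mapping step concrete, here is a minimal sketch of turning column headers into property URIs and writing out N-Triples. This is not how Guid-O-Matic works internally; it only illustrates the idea. The column names, the `example.org` URIs, the `rows_to_ntriples` helper and the sample rows are all made up for illustration.

```python
import csv
import io

# Hypothetical mapping of CSV column headers to property URIs.
# The URIs are made up; they do not need to resolve to anything.
COLUMN_TO_PROPERTY = {
    "scientificName": "http://example.org/terms/scientificName",
    "rank": "http://example.org/terms/rank",
}

# Assumed namespace for minting subject URIs from the id column.
BASE_URI = "http://example.org/name/"


def rows_to_ntriples(csv_text, id_column="id"):
    """Yield one N-Triples line for every mapped, non-empty cell."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        subject = f"<{BASE_URI}{row[id_column]}>"
        for column, prop in COLUMN_TO_PROPERTY.items():
            value = row.get(column, "")
            if value:
                # Escape backslashes and double quotes for a plain literal.
                literal = value.replace("\\", "\\\\").replace('"', '\\"')
                yield f'{subject} <{prop}> "{literal}" .'


# Made-up sample data; the second row has an empty rank cell.
SAMPLE = "id,scientificName,rank\n1,Acacia dealbata,species\n2,Acacia,\n"
triples = list(rows_to_ntriples(SAMPLE))
for line in triples:
    print(line)
```

Empty cells are skipped rather than written as empty literals, so a sparse CSV yields fewer triples than it has cells.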

nielsklazenga commented 5 years ago

Thanks @baskaufs. The time it took me to create my example for issue #30 made me realise it will take some time for me to get all the data together.

baskaufs commented 5 years ago

No problem. Just let me know...

nielsklazenga commented 4 years ago

I have added an example Data Package to the examples in this repository: /examples/datapackage.
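For readers who have not seen one, a Data Package is a set of data files plus a `datapackage.json` descriptor. The sketch below builds a minimal descriptor for two linked CSV tables. It does not reproduce the example in /examples/datapackage; the package name, file names, field names and the choice of two tables for names and concepts are assumptions made for illustration.

```python
import json

# Hypothetical descriptor for two CSV tables linked by a foreign key.
descriptor = {
    "name": "example-taxonomic-revision",
    "resources": [
        {
            "name": "taxon-names",
            "path": "taxon_names.csv",
            "schema": {
                "fields": [
                    {"name": "id", "type": "string"},
                    {"name": "scientificName", "type": "string"},
                ],
                "primaryKey": "id",
            },
        },
        {
            "name": "taxon-concepts",
            "path": "taxon_concepts.csv",
            "schema": {
                "fields": [
                    {"name": "id", "type": "string"},
                    {"name": "taxonNameId", "type": "string"},
                ],
                "primaryKey": "id",
                # Each concept row points at a row in the taxon-names table.
                "foreignKeys": [
                    {
                        "fields": "taxonNameId",
                        "reference": {"resource": "taxon-names", "fields": "id"},
                    }
                ],
            },
        },
    ],
}

# This string is what would be saved as datapackage.json next to the CSV files.
datapackage_json = json.dumps(descriptor, indent=2)
print(datapackage_json)
```

The foreign key is what carries the relationship between the two tables, much as it would in a database schema, while the data itself stays in plain CSV.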