swerik-project / the-swedish-parliament-corpus

A repository for managing public, versioned releases of the Swedish Parliament Corpus.

Proposed API structure #15

Closed MansMeg closed 3 weeks ago

MansMeg commented 1 month ago

Hi!

Here is my proposed structure for the data repository APIs. Feel free to comment.

BobBorges commented 1 month ago

Quality estimation is also a kind of test and uses similar conceptual workflows. I'm not sure I see the point in a separate folder. Instead:

test/                    # quality estimation code goes here along with integrity tests
test/data                # data goes here
test/quality-estimates   # versioned quality estimations

ninpnin commented 1 month ago

I would keep the gold standard data in its own folder. "test" is very generic and only partially applicable.

MansMeg commented 1 month ago

OK. What's your suggested solution, Väinö?

ninpnin commented 1 month ago

Add a gold-standard folder

MansMeg commented 1 month ago

So your suggestion is like this (I called gold standard data quality-data):

test/                    # quality estimation code goes here along with integrity tests
test/test-data           # data used in data integration tests goes here
test/quality-estimates   # versioned quality estimations
test/quality-data        # data used for quality estimation goes here

fredrik1984 commented 1 month ago

@ninpnin, what do you think of this, Väinö?

ninpnin commented 1 month ago

Sorry this is very late, but I would have the gold standard completely outside the test folder. What do you think?

MansMeg commented 1 month ago

How would you store it?

ninpnin commented 1 month ago

Have a quality-data folder in the project root, at the same level as test, etc. But I'm not necessarily planning to die on this hill.

MansMeg commented 1 month ago

I agree with you here. So maybe have something like this:

/quality/...            -> scripts used for quality estimation
/quality/data/...       -> data used for quality estimation
/quality/estimates/...  -> estimates by version stored for easy access

i.e., mirror the test folder structure.
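
For illustration only, the quality folder layout proposed in this comment could be scaffolded with a short Python script. This is a minimal sketch, not part of the repository: the function name create_layout and the QUALITY_LAYOUT list are hypothetical, and only the three quality paths named above are assumed.

```python
import tempfile
from pathlib import Path

# Folders taken from the proposal in this thread. The list name and the
# function below are hypothetical helpers, not part of the repository.
QUALITY_LAYOUT = [
    "quality",            # scripts used for quality estimation
    "quality/data",       # data used for quality estimation
    "quality/estimates",  # estimates by version, stored for easy access
]


def create_layout(root: Path) -> list[Path]:
    """Create the proposed folders under root and return their paths."""
    created = []
    for rel in QUALITY_LAYOUT:
        path = root / rel
        # parents=True creates missing parents; exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # Try it in a throwaway directory rather than a real checkout.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in create_layout(Path(tmp)):
            print(folder.relative_to(tmp))
```

Running it against a temporary directory, as in the last lines, leaves any real checkout untouched.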

MansMeg commented 3 weeks ago

I have now updated the file with the decision we discussed. @fredrik1984, you can now merge this.