Open ulfgebhardt opened 3 years ago
There should always be a README.md, IMHO.
I suggest separate repositories for separate data sets (bgbl, banz, ...).
The repos should have proper names: "banz" on its own means nothing. Even though I argue for English names, "Bundesanzeiger" is acceptable as an entity name, I guess, and it tells the reader what the repo is about.
We may not even need a separate repo. Using separate branches would probably already get us most of the way. We should then have a cleaned-up version of the tools branch that omits all the data commits, so that the tools themselves are quick to clone.
:rocket: Feature
It is common practice to store a scraper and its data separately, but that is not the case here, or at least only partly.
We have a data folder containing JSON files: https://github.com/bundestag/gesetze-tools/tree/master/data
But there is a repo associated with this scraper as well: https://github.com/bundestag/gesetze
It is still unclear to me how the tool produces the output stored in the gesetze repo.
Nevertheless, I consider it useful to keep all data separate from the tools that create it. I think it would be wise to create a new repo for the scraped data (with an English name, please).
Design & Layout
Data in a data-repo should be stored in a data folder
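To make the layout proposal concrete, here is a minimal sketch of how a scraper could write its output into a `data` folder of a separate data repo. The function name `write_dataset`, the one-subfolder-per-data-set layout, and the file naming are assumptions made for illustration; none of this is taken from the existing tools.

```python
import json
from pathlib import Path


def write_dataset(repo_root: Path, dataset: str, name: str, records: list) -> Path:
    """Write records as JSON to <repo_root>/data/<dataset>/<name>.json.

    Hypothetical helper: the layout (one subfolder per data set under
    "data") is an assumption, not the current behaviour of gesetze-tools.
    """
    target = Path(repo_root) / "data" / dataset / f"{name}.json"
    # Create data/<dataset>/ on first use.
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps umlauts readable in the stored files.
        json.dump(records, fh, ensure_ascii=False, indent=2)
    return target
```

With a checkout of the data repo at some local path, a scraper would call something like `write_dataset(Path("path/to/data-repo"), "bundesanzeiger", "2020", records)`, so the tools repo itself never has to contain scraped files.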