texttechnologylab / GerParCor

German Parliamentary Corpus (GerParCor)
GNU Affero General Public License v3.0

Other formats / conversion to CSV or "classic" XML #3

Open cgnguyen opened 2 years ago

cgnguyen commented 2 years ago

Thank you so much for this treasure trove of data. Unfortunately, I am having a very hard time getting the data into a form I can use in my usual analysis workflow (R + quanteda). I have tried to get up to speed with XMI, but I was wondering whether you plan to provide alternative formats as well, or have suggestions for how I can easily convert these documents into a different file format?

stefan-mueller commented 1 year ago

I have the same question as @cgnguyen. The text corpora and documentation are fantastic, but I need help with the XMI format. Do you have instructions on how to load these data and – ideally – how to convert them to different file formats?

matzefrey commented 1 year ago

Does anyone already have a solution for this, or know of another source for Landtag protocols in a machine-readable format?
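For anyone landing here with the same question, here is a minimal sketch of one possible route, not a method confirmed by the maintainers. It assumes the corpus files use the standard UIMA XMI serialization, where the raw document text sits in the `sofaString` attribute of a `cas:Sofa` element. It pulls out that text only and ignores all annotation layers. The function names `extract_text` and `write_csv` and the sample document are hypothetical, made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace UIMA uses for CAS-level elements in XMI files.
# Assumption: the corpus files follow this standard serialization.
CAS_NS = "http:///uima/cas.ecore"


def extract_text(xmi_source):
    """Return the text stored in the first cas:Sofa element, or "" if none.

    xmi_source may be a file path or a binary file-like object.
    """
    root = ET.parse(xmi_source).getroot()
    sofa = root.find(f"{{{CAS_NS}}}Sofa")
    if sofa is None:
        return ""
    return sofa.get("sofaString", "")


def write_csv(rows, out):
    """Write (doc_id, text) pairs to `out` as CSV with a header row."""
    writer = csv.writer(out)
    writer.writerow(["doc_id", "text"])
    writer.writerows(rows)


# Tiny hand-written XMI document standing in for a real corpus file.
SAMPLE_XMI = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" '
    'xmlns:cas="http:///uima/cas.ecore" xmi:version="2.0">'
    '<cas:NULL xmi:id="0"/>'
    '<cas:Sofa xmi:id="1" sofaNum="1" sofaID="_InitialView" '
    'mimeType="text" sofaString="Guten Tag. Die Sitzung beginnt."/>'
    '</xmi:XMI>'
)

text = extract_text(io.BytesIO(SAMPLE_XMI.encode("utf-8")))
buffer = io.StringIO()
write_csv([("sample", text)], buffer)
print(buffer.getvalue())
```

For real files, `extract_text` can be given a file path instead of the in-memory sample, and compressed files would need to be opened with the matching decompression module first. The resulting CSV should then be loadable in R as a data frame and passed to quanteda's `corpus()` with the `text` column as the text field.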