sillsdev / silnlp

A set of pipelines for performing experiments on various NLP tasks with a focus on resource-poor/minority languages.

Adding generation of (train/val/test).txt files to XRI etl script #493

Closed rminsil closed 3 weeks ago

rminsil commented 1 month ago

This PR extends the work in https://github.com/sillsdev/silnlp/pull/491.

That PR added generation of all.txt files. This PR further adds generation of (train/val/test).txt files.

Note that there is currently no data transformation logic; that will be added in later PRs.
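
For reviewers who want a concrete picture of the output, here is a minimal sketch of writing (train/val/test).txt files from a list of sentences. It is not the code in this PR: the function names `split_lines` and `write_splits`, the 80/10/10 fractions, the fixed seed, and the random assignment are all assumptions made for illustration. The actual script may take the split assignment from the source data instead.

```python
import random
from pathlib import Path
from typing import Dict, List, Sequence


def split_lines(
    lines: Sequence[str],
    val_fraction: float = 0.1,   # assumed fraction, not taken from the PR
    test_fraction: float = 0.1,  # assumed fraction, not taken from the PR
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Shuffle the lines deterministically and cut them into three splits."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    return {
        "val": shuffled[:n_val],
        "test": shuffled[n_val:n_val + n_test],
        "train": shuffled[n_val + n_test:],
    }


def write_splits(splits: Dict[str, List[str]], output_dir: Path) -> None:
    """Write each split to <output_dir>/<split name>.txt, one sentence per line."""
    output_dir.mkdir(parents=True, exist_ok=True)
    for name, split in splits.items():
        text = "".join(line + "\n" for line in split)
        (output_dir / f"{name}.txt").write_text(text, encoding="utf-8")
```

Calling `write_splits(split_lines(sentences), Path("out"))` would produce `out/train.txt`, `out/val.txt` and `out/test.txt`; the `out` directory name is likewise hypothetical.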


Note that this PR currently targets the branch issue-473-add-file-saving rather than master. I'm still waiting for https://github.com/sillsdev/silnlp/pull/491 to be approved and merged, so I kept working on top of my first branch. When 491 is merged, I'll retarget this PR to master.

