This change should allow us to remove the `*_all.py` versions of the scripts: every script now accepts any number of files or reads directly from stdin. Each script also outputs a single file, so there is no need to maintain the segmentation of the input. Hopefully this will help obscure differences in segmentation between the JDSW and SBCK editions of texts.
[ ] sbck2csv (add file name as a column in output)
xml2conllu accepts a single file and thus remains unchanged.
See https://docs.python.org/3/library/fileinput.html, in particular functions like `fileinput.filename()`, `fileinput.lineno()`, etc.
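
For reference, a minimal sketch of the pattern, not the actual `sbck2csv` code: it reads any number of files (or stdin when none are given) through `fileinput` and writes one CSV with the source file name as a column. The function name `merge_to_csv` and the column names `file`, `line`, `text` are hypothetical, and treating each input line as one row is an assumption made only for illustration.

```python
import csv
import fileinput


def merge_to_csv(paths, out):
    """Write every input line as one CSV row tagged with its source file.

    `paths` is a list of file names; an empty list makes fileinput
    fall back to sys.stdin, whose name is reported as "<stdin>".
    """
    writer = csv.writer(out, lineterminator="\n")
    # Hypothetical header; the real scripts may use other columns.
    writer.writerow(["file", "line", "text"])
    # hook_encoded applies to named files only; stdin keeps its own encoding.
    with fileinput.input(
        files=paths, openhook=fileinput.hook_encoded("utf-8")
    ) as stream:
        for line in stream:
            writer.writerow([
                stream.filename(),    # same value as fileinput.filename()
                stream.filelineno(),  # line number within the current file
                line.rstrip("\n"),
            ])
```

A script could then call `merge_to_csv(sys.argv[1:], sys.stdout)`, so that `script.py a.txt b.txt` and `cat a.txt | script.py` both produce a single output. `filelineno()` restarts at 1 for each file, whereas `lineno()` keeps counting across all inputs; which one is more useful depends on the script.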