Closed: balmas closed this issue 3 years ago
I want to be able to allow CTS URNs in metadata in input files, and we'll need to code a tokenization rule for them so that spaCy doesn't treat the `:` and `.` in them as punctuation.
We should probably add similar exceptions for HTTP URIs.
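A rough sketch of what such a rule could look like, using only Python's `re` module. The CTS URN pattern is a simplified assumption (namespace, work identifier, optional passage), not the full CTS grammar, and the names `CTS_URN`, `HTTP_URI`, `PROTECTED_RE` and `is_protected` are made up for illustration.

```python
import re

# Simplified, assumed shape of a CTS URN:
#   urn:cts:<namespace>:<work identifier>[:<passage>]
# This is NOT the full CTS grammar, just enough to cover common cases.
CTS_URN = r"urn:cts:[\w-]+:[\w.-]+(?::[\w.@\[\]-]+)?"

# Loose pattern for HTTP(S) URIs: scheme followed by any non-space run.
HTTP_URI = r"https?://\S+"

PROTECTED_RE = re.compile(rf"(?:{CTS_URN}|{HTTP_URI})")


def is_protected(chunk: str) -> bool:
    """Return True if a whitespace-delimited chunk should stay one token."""
    return PROTECTED_RE.fullmatch(chunk) is not None


# Possible wiring into spaCy (an assumption, untested here): the tokenizer
# exposes a `token_match` attribute that takes a callable with the signature
# of a compiled regex's match method, e.g.
#   nlp.tokenizer.token_match = PROTECTED_RE.fullmatch
# Note that assigning it replaces whatever token_match was set before.

if __name__ == "__main__":
    for chunk in ["urn:cts:greekLit:tlg0012.tlg001.perseus-grc1:1.1-1.10",
                  "http://example.org/a.b:c",
                  "word."]:
        print(chunk, is_protected(chunk))
```

Matching the whole chunk rather than a prefix keeps ordinary words with trailing punctuation out of the exception, but it also means a URN directly followed by a comma would not be protected; whether that matters depends on how the metadata is written in the input files.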