hipe-eval / HIPE-pycommons

HIPE-commons is a Python library with generic and reusable functionalities for managing NE-annotated data around the INCEpTION annotation platform.
GNU Affero General Public License v3.0

Various new tsv functionalities #3

Closed sven-nm closed 2 years ago

sven-nm commented 2 years ago
mromanello commented 2 years ago

Thanks a lot, @sven-nm! Would it be possible to add a test for tsv_to_torch_datasets? And also bump the version to 0.3.0.

sven-nm commented 2 years ago

@mromanello, I think it's all good now. I'll let you have a final check before merging.

sven-nm commented 2 years ago

Missing feature: #3 does not allow for cutting and recycling samples whose length exceeds the model's max_length. To be fixed before merge.

Also: untokenize the dataset. Tokenization should be done afterwards.
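
To illustrate the cutting and recycling of over-long samples mentioned above, here is a minimal sketch. It is not the code from this PR: the helper name `split_long_sample` is hypothetical, and it assumes "recycling" means that the overflow of a sample is kept as additional samples rather than discarded. It also counts word-level units, whereas a real model's max_length usually counts subword tokens plus special tokens.

```python
from typing import List, Tuple

Sample = Tuple[List[str], List[str]]


def split_long_sample(tokens: List[str], labels: List[str], max_length: int) -> List[Sample]:
    """Cut one (tokens, labels) sample into consecutive chunks of at most max_length.

    Hypothetical helper for illustration only; not part of HIPE-pycommons.
    The overflow is "recycled" as extra samples instead of being truncated.
    """
    if max_length <= 0:
        raise ValueError("max_length must be a positive integer")
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    chunks: List[Sample] = []
    # Step through the sample in windows of max_length items.
    for start in range(0, len(tokens), max_length):
        end = start + max_length
        chunks.append((tokens[start:end], labels[start:end]))
    return chunks


# Example with made-up data: a 5-word sample and max_length = 2.
example_tokens = ["Der", "Bundesrat", "tagte", "in", "Bern"]
example_labels = ["O", "B-org", "O", "O", "B-loc"]
example_chunks = split_long_sample(example_tokens, example_labels, max_length=2)
```

With this input the sample is cut into three chunks of lengths 2, 2 and 1, and no word or label is lost. A plain cut like this can split an entity across two chunks, so a real implementation may prefer to cut at sentence boundaries or use overlapping windows.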

sven-nm commented 2 years ago

@mromanello @simon-clematide I just added a few functionalities and fixed the issues from my previous comment. Now ready for a merge ;-)