EricFillion / happy-transformer

Happy Transformer makes it easy to fine-tune and perform inference with NLP Transformer models.
http://happytransformer.com
Apache License 2.0
517 stars, 66 forks

Fix removed newline for text generation training #297

Closed: EricFillion closed this pull request 2 years ago

EricFillion commented 2 years ago

Closes #283

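When loading a text file, Hugging Face's load_dataset() function splits the input into one example per line by default and strips the newline characters. As a result, the newlines were missing from the text generation training data. With this change, the newlines are added back within the preprocess_concatenate() function.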
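For anyone following along, here is a minimal sketch of the problem and the fix in plain Python. It does not call Hugging Face or Happy Transformer; `split_like_text_loader` and `restore_newlines` are hypothetical helpers that only imitate the line-wise splitting described above, and they are not the actual code in preprocess_concatenate().

```python
def split_like_text_loader(raw: str) -> list[str]:
    """Imitate line-wise loading: one example per line, newline stripped.

    Hypothetical stand-in for the default behaviour described above.
    """
    return raw.splitlines()


def restore_newlines(lines: list[str]) -> list[str]:
    """Append the newline that was stripped from each line (hypothetical helper)."""
    return [line + "\n" for line in lines]


raw_text = "First line.\nSecond line.\nThird line.\n"
lines = split_like_text_loader(raw_text)

# Without the fix, concatenating the examples glues the lines together.
joined_without_fix = "".join(lines)

# With the fix, the newlines are restored before concatenation.
joined_with_fix = "".join(restore_newlines(lines))

print(repr(joined_without_fix))
print(repr(joined_with_fix))
```

In this sketch the concatenated text only matches the original file once the newlines are restored, which is the behaviour this PR aims for during text generation training.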

AbdelrhmanNile commented 2 years ago

why is it not merged yet?

EricFillion commented 2 years ago

> why is it not merged yet?

For some reason, one of the test cases is hanging. I avoid merging unless all the test cases pass. I'll look into it some more this weekend. Please let me know if you have any ideas.