code-kern-ai / refinery

The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
https://www.kern.ai
Apache License 2.0
1.39k stars 66 forks

[BUG] - Linebreaks are not displayed correctly in the labeling view #256

Closed DerKernigeFeuerpfeil closed 1 year ago

DerKernigeFeuerpfeil commented 1 year ago

Describe the bug

I have a pandas DataFrame with an attribute "email" that contains full-length e-mails. These e-mails have a typical structure with a greeting, a body, an ending and a footer. The formatting is achieved with "\n" characters in my Python strings. When I export my data to CSV (quoting=1) or to JSON using the pandas functions and then upload it to refinery, the formatting gets partially lost. The displayed e-mails are now just a single blob without any linebreaks.

After testing with a minimal example, I found that "\n" is displayed correctly, but "\n\n" does not display ANY linebreak at all.
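To rule out the export step, here is a minimal sketch that round-trips the sample record from EXAMPLE.json (attached to this report) through JSON and through CSV with full quoting. It uses only the Python standard library rather than pandas, on the assumption that `quoting=1` corresponds to `csv.QUOTE_ALL`. It only checks that the exported text keeps its linebreaks; it says nothing about how refinery renders them.

```python
import csv
import io
import json

# Sample record copied from EXAMPLE.json in this report.
records = [{"Summary": "Hello there,\n\nGeneral Kenobi\n\nFOOTER\nFooter2\nFooter3\nFooter4\n\nFooter5"}]

# JSON round-trip: serialize, then parse again.
json_back = json.loads(json.dumps(records))

# CSV round-trip with every field quoted (csv.QUOTE_ALL has the value 1).
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["Summary"], quoting=csv.QUOTE_ALL)
writer.writeheader()
writer.writerows(records)
buffer.seek(0)
csv_back = list(csv.DictReader(buffer))

# Both exports should return the text unchanged, including chained linebreaks.
json_ok = json_back == records
csv_ok = csv_back == records
double_breaks = records[0]["Summary"].count("\n\n")
print(json_ok, csv_ok, double_breaks)
```

If both checks pass, the chained "\n\n" sequences are still present in the uploaded files, which would suggest the linebreaks are lost somewhere after the upload rather than during the export.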

To Reproduce

Steps to reproduce the behavior:

  1. Go to refinery
  2. Upload my sample data as a new project
  3. Create Extraction labeling task
  4. Go to labeling view and see error

Expected behavior

All linebreaks should be displayed correctly, no matter how many are chained.

Screenshots

[image]

EXAMPLE.json

[{"Summary":"Hello there,\n\nGeneral Kenobi\n\nFOOTER\nFooter2\nFooter3\nFooter4\n\nFooter5"}]