code-kern-ai / refinery

The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
https://www.kern.ai
Apache License 2.0
1.39k stars, 66 forks

[BUG] - Labels that are just numbers can't be imported #250

Closed. JWittmeyer closed this issue 1 year ago

JWittmeyer commented 1 year ago

Describe the bug

Uploading a file whose labels are numeric raises an error during import.

To Reproduce

Steps to reproduce the behavior:

  1. Go to new project
  2. Select a file with labels as numbers (e.g. 1, 2, 3)
  3. Upload file
  4. See error

Expected behavior

Numeric labels are imported as strings: "1", "2", "3".

Additional context

Example file that reproduces the issue: example_numbered labels.zip

Example error log:

Traceback (most recent call last):
  File "/app/./api/transfer.py", line 70, in post
    init_file_import(task, project_id, is_global_update)
  File "/app/./api/transfer.py", line 236, in init_file_import
    transfer_manager.import_records_from_file(project_id, task)
  File "/app/./controller/transfer/manager.py", line 69, in import_records_from_file
    import_file(project_id, task)
  File "/app/./controller/transfer/record_transfer_manager.py", line 134, in import_file
    import_records_and_rlas(
  File "/app/./controller/transfer/record_transfer_manager.py", line 66, in import_records_and_rlas
    ) = split_record_data_and_label_data(chunk)
  File "/app/./controller/transfer/record_transfer_manager.py", line 270, in split_record_data_and_label_data
    item.strip() == ""
AttributeError: 'float' object has no attribute 'strip'

jhoetter commented 1 year ago

str(item) should (hopefully) do the trick :)

edit: didn't see your PR; go ahead with your solution :)
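For readers landing here from the traceback, the `str(item)` suggestion above can be sketched roughly as follows. This is not the code from the merged PR, which is not shown in this thread; `is_empty_label` and `label_to_str` are hypothetical helper names, and the NaN handling is an assumption about how missing values might arrive. Note that a bare `str(1.0)` yields "1.0", so the sketch also collapses integer-valued floats to match the expected "1", "2", "3".

```python
import math


def is_empty_label(item):
    """Return True for missing or blank labels without assuming `item` is a str.

    Hypothetical helper: stands in for the `item.strip() == ""` check that
    raised AttributeError when `item` was a float.
    """
    if item is None:
        return True
    # Assumption: missing numeric cells may arrive as float NaN.
    if isinstance(item, float) and math.isnan(item):
        return True
    return str(item).strip() == ""


def label_to_str(item):
    """Convert a label value to its string form (hypothetical helper)."""
    # Integer-valued floats such as 1.0 become "1" rather than "1.0".
    if isinstance(item, float) and item.is_integer():
        return str(int(item))
    return str(item).strip()
```

With these helpers, a numeric label such as `1.0` is treated as non-empty and stored as "1", while blank strings are still skipped.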