code-kern-ai / refinery

The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
https://www.kern.ai
Apache License 2.0
1.4k stars, 68 forks

[BUG] - Manually uploaded labels not available when downloading records #224

Closed: xavialex closed this issue 1 year ago

xavialex commented 1 year ago

When records are uploaded with a label (using the column format [feature__label]), downloading the records results in null labels, and the field is_valid_manual_label is set to false. I'm attaching a very simple test.csv to reproduce the behavior: create a project from this file, then go to Settings -> Download records.
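For anyone without the attachment, a file of the same shape can be sketched as below. This is only an illustration: the column names `headline` and `headline__sentiment` and the row values are made up, and the only thing taken from the report is the `feature__label` naming pattern.

```python
import csv
import io

# Hypothetical data: "headline" is the feature column and
# "headline__sentiment" follows the feature__label pattern from the report.
ROWS = [
    {"headline": "Stocks rally after earnings", "headline__sentiment": "positive"},
    {"headline": "Storm delays flights", "headline__sentiment": "negative"},
]


def build_test_csv(rows):
    """Return CSV text with a header row and one line per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = build_test_csv(ROWS)

# Write the file that would then be uploaded when creating the project.
with open("test.csv", "w", newline="", encoding="utf-8") as f:
    f.write(csv_text)
```

Creating a project from the resulting test.csv and then downloading the records should be enough to check whether the label comes back as null.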

Extra info

See in Discord

JWittmeyer commented 1 year ago

Hi @xavialex,

we looked into it. You are right, the is_valid_manual_label field isn't set correctly for imported labels via the __

For your existing projects we have a few options:

  1. (and probably easiest) export the project via project snapshot and reimport it. This should trigger setting the correct value. Please keep in mind that you shouldn't delete the old project file until it's verified that the process worked to ensure your progress isn't lost.
  2. (only viable if you have a database browser, since doing this via the console would be a hassle) I could provide you with an update query, though it's quite big.
  3. Not sure if you have access to http://localhost:4455/graphql/ (can't check atm). If yes we could work some magic there.

All in all, I would try the first approach and only decide on further steps depending on the result. For your example file this worked without any issues.
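To verify the result after the re-import, one option is to scan the downloaded records for labels that are still null. The snippet below is a rough sketch, not the actual export schema: it assumes the download is a JSON list of record objects, and the key `headline__sentiment` is a hypothetical placeholder for whatever key the export uses for the manual label.

```python
import json


def records_missing_label(records, label_key):
    """Return the records whose label_key is absent or null."""
    return [record for record in records if record.get(label_key) is None]


# Hypothetical export content; a real file would be read with json.load().
exported = json.loads(
    """
    [
      {"headline": "Stocks rally after earnings", "headline__sentiment": "positive"},
      {"headline": "Storm delays flights", "headline__sentiment": null}
    ]
    """
)

missing = records_missing_label(exported, "headline__sentiment")
print(f"{len(missing)} of {len(exported)} records have no label")
```

If the list comes back empty for the re-imported project, the labels made it through the export.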

xavialex commented 1 year ago

Hi @JWittmeyer, thanks. Not a real issue, since I bumped into this while running small tests, so I have no affected projects. I'll stick with the CSV files and upload them into new projects once a release takes care of this.