Closed marcelbusch closed 1 year ago
@marcelbusch, do you know for sure that you are the only one working on this? Which versions of the server and Python package are you using?
I think it might be good to check the query syntax against the query behaviour. Are you using it correctly?
Yes, I am sure I'm the only one working on this. The server and Python package versions are both 1.10.0.
I'm attaching a screenshot to clarify which behaviour I mean. I labeled and validated a record, so there are now 2 versions of it in my dataset. When I query for a combination of words that only exists in this specific record, I naturally get both versions in Python via rg.load, and the UI also shows at the bottom that there are 2 records, but I can only see one of them.
@marcelbusch, as described in our query syntax overview, you would need to query a bit differently in that case: "nurnberg AND demonstration AND wegstrecke" should be what you intend to achieve, correct?
W.r.t. the duplicated records, is that still happening? Could you show me a video?
No, when I connect them with AND it's still the same behaviour...
Yes that's still happening, here's a video:
https://github.com/argilla-io/argilla/assets/7290487/b73778ab-2c40-44e4-bc84-5f093c036ab5
@frascuchon
Thanks @marcelbusch, this really helps us.
Hi @marcelbusch, we cannot reproduce this behaviour. Could you provide us with more context w.r.t. your data? Also, could you run pip install argilla --upgrade, delete your current docker images, and pin their tag to 1.10.0 during re-deployment?
Thanks for your feedback @marcelbusch! Is it also happening when you validate a single record (without the bulk validation action)?
Can you also share the extra record info (you can access it by clicking the ... button) for both records?
Could you provide us with more context w.r.t. your data?
I could send you a csv of my pandas dataframe, or what do you mean?
Also, could you run pip install argilla --upgrade, delete your current docker images, and pin their tag to 1.10.0 during re-deployment?
So I should run server version 1.10.0 with argilla Python version 1.11.0? And the docker tag would be v1.10.0 or releases-1.10.0?
Is it also happening when you validate a single record (without the bulk validation action)?
Yes, same thing when using the validate button at the bottom and when setting records per page to 1
Can you also share the extra record info (you can access it by clicking the ... button) for both records?
They both have the same id. Could this be because I set the id manually during record creation?
I took a look in my debugger at this point: updateDatasetRecords. In the record that gets passed in there when I click on a label, the id is incorrect; in fact it seems to be rounded to the nearest 100. The record has an initial id of 1478720431326732291, while the record that gets passed into the updateDatasetRecords function has the id 1478720431326732300.
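The rounding reported here is consistent with the id being stored as an IEEE 754 double, which is the number type JavaScript uses and which represents integers exactly only up to 2**53 - 1. A minimal Python sketch that reproduces the effect with the ids from this thread (the helper name through_float64 is hypothetical, chosen only for illustration):

```python
# Largest integer a JavaScript Number represents exactly (Number.MAX_SAFE_INTEGER).
MAX_SAFE_INTEGER = 2**53 - 1


def through_float64(value: int) -> int:
    """Round-trip an integer through a 64-bit float, as a JavaScript Number would store it."""
    return int(float(value))


original_id = 1478720431326732291  # id set manually in Python (from the thread)
ui_id = 1478720431326732300        # id observed in updateDatasetRecords (from the thread)

stored_id = through_float64(original_id)

# The id is far above the safe range, so the round trip cannot preserve it.
print("above safe range:", original_id > MAX_SAFE_INTEGER)
print("survives round trip:", stored_id == original_id)
# Both the original id and the id seen in the UI collapse to the same double.
print("same double as UI id:", stored_id == through_float64(ui_id))
```

If this is what happens in the browser, the UI would send back an id that no longer matches the stored record, which would explain why a second record is created instead of the original being updated.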
Thanks for all this info @marcelbusch. It's really helpful. We'll work on fixing that.
As a temporary workaround, I think you can set the record id as a string instead of a number. This should avoid the problem.
I've been doing some tests and it looks like this is a JavaScript limit.
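A minimal sketch of that workaround, assuming the ids come from your own rows before the records are created. The rows list and the browser_safe_id helper are hypothetical; only the idea of passing the id as a string comes from this thread.

```python
from typing import Union


def browser_safe_id(raw_id: Union[int, str]) -> str:
    """Return the id as a string so no digits can be lost when the browser parses it."""
    return str(raw_id)


# Hypothetical input rows; in practice these might come from a pandas DataFrame.
rows = [{"id": 1478720431326732291, "text": "example text"}]

# Replace each numeric id with its string form, leaving the other fields untouched.
safe_rows = [{**row, "id": browser_safe_id(row["id"])} for row in rows]
print(safe_rows[0]["id"])

# The string id would then be passed when creating each record, for example:
#   rg.TextClassificationRecord(text=row["text"], id=row["id"], multi_label=True)
# (not executed here; shown only to indicate where the converted id goes).
```

Records already logged with numeric ids would presumably need to be logged again with the string ids for this to take effect.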
Using curl:
curl -X 'POST' \
"http://localhost:6900/api/datasets/test-dataset/TextClassification:search?include_metrics=false&workspace=argilla&limit=50&from=0" \
-H 'accept: application/json' \
-H 'X-Argilla-Api-Key: argilla.apikey' \
-H 'Content-Type: application/json' \
-d '{}'
{"total":1,"records":[{"id":10805720881385292014,"status":"Default","metrics":{},"last_updated":"2023-06-30T13:23:45.462574","inputs":{"additionalProp1":"string","additionalProp3":"string","additionalProp2":"string"},"multi_label":true}],"aggregations":{"predicted_as":{},"annotated_as":{},"annotated_by":{},"predicted_by":{},"status":{"Default":1},"predicted":{},"score":{},"words":{"string":1},"metadata":{}}}
From UI:
@leiyre @damianpumar @keithCuniah Any ideas?
Describe the bug: In multi-label text classification, whenever I annotate records, they get copied to a validated version, but the original record stays the same. So I have a dataset with 1000 records, and after annotating 20 of them, I now have 1020 records in my dataset: the 1000 records with status "default" and the 20 annotated records. Is this expected behaviour? As it stands, I can't filter for records I haven't annotated yet: when I filter for status:default I still get all 1000 records.
I run argilla and elasticsearch with the supplied docker-compose file without any modifications.
To Reproduce Steps to reproduce the behavior:
Expected behavior: I expect the original records to change status, so after annotating 20 records I should still have 1000 records in my dataset, now 980 with status "default" and 20 with status "validated".
Environment (please complete the following information):
Edit: Update on some more behaviour: