Hi, the output should indeed be one label per document. Could you please check that the first column in your dataset is the annotator ID and the second is the document ID, and not the reverse? If the order is correct, could you please provide a small example input where the problem arises? Thanks!
Yes, that was the problem. Thanks! I was generating the CSV input file for fast_dawid_skene.py with the csv package, but the column order I specified was not preserved. My corrected header is now
annotator_id,question_id,annotation_id
and the output of the algorithm is OK. The answers here helped: https://stackoverflow.com/questions/15653688/preserving-column-order-in-python-pandas-dataframe
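For anyone hitting the same problem, here is a minimal sketch of one way to pin the column order when writing the input file with Python's standard csv module. The column names come from the header above. The example rows and the `write_annotations` helper are hypothetical and are not part of the repository.

```python
import csv
import io

# Column order taken from the header quoted above.
FIELDNAMES = ["annotator_id", "question_id", "annotation_id"]


def write_annotations(rows, out_file):
    """Write annotation dicts to out_file with columns in FIELDNAMES order.

    Hypothetical helper for illustration; not part of the repository.
    """
    writer = csv.DictWriter(out_file, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Made-up example rows. The key order inside each dict does not matter,
# because DictWriter emits the columns in the order given by fieldnames.
rows = [
    {"question_id": "q1", "annotation_id": "1", "annotator_id": "a1"},
    {"annotation_id": "0", "annotator_id": "a2", "question_id": "q1"},
]

buffer = io.StringIO()
write_annotations(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead of an in-memory buffer, pass a file opened with `open(path, "w", newline="")`. If the data is already in a pandas DataFrame, `df.to_csv(path, columns=FIELDNAMES, index=False)` should fix the order in the same way.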
Thanks for the update. To clarify, does the code in the repository or the pandas version used have a bug in the conversion to CSV? The link you shared seems to suggest that there should be no issues in pandas 0.20.2. However, if there is one, please let us know. Thanks!
Closing since the issue was resolved. Please comment/reopen if the problem persists. Thanks.
Hey, I am currently running the command
python scripts/fast_dawid_skene.py --dataset toy --mode aggregate --algorithm FDS --print_result
successfully for my dataset, but when I check the results (either in the output file or the shell output), I see one final annotation per annotator. I expected one label per document to be created. I checked what happens at the end of main.py for printing, and also the code in utils.to_csv, but I am unsure what to change to make it work. Could you please explain how to get the final labels? Thanks!