sukrutrao / Fast-Dawid-Skene

Code for the algorithms in the paper: Vaibhav B Sinha, Sukrut Rao, Vineeth N Balasubramanian. Fast Dawid-Skene: A Fast Vote Aggregation Scheme for Sentiment Classification. KDD WISDOM 2018
https://sites.google.com/view/fast-dawid-skene
MIT License

How to print one final label per data instance #6

Closed ghost closed 5 years ago

ghost commented 5 years ago

Hey, I am currently running the command python scripts/fast_dawid_skene.py --dataset toy --mode aggregate --algorithm FDS --print_result successfully for my dataset, but when I check the results (either in the output file or the shell output), I see one final annotation per annotator. I expected one label per document. I looked at the printing code at the end of main.py and at the code in utils.to_csv, but I am unsure what to change to make it work. Could you please explain how to get the final labels? Thanks!

sukrutrao commented 5 years ago

Hi, the output should indeed be one label per document. Could you please check whether the first column in your dataset is the annotator ID and the second is the document ID, and not the reverse? If the order is correct, could you please provide a small example input where the problem arises? Thanks!

ghost commented 5 years ago

Yes, that was the problem. Thanks! I was generating the CSV input file for fast_dawid_skene.py using the csv package, but the column order I specified was not preserved. Now my correct header is

annotator_id,question_id,annotation_id

and the output of the algorithm is OK. The answers here helped: https://stackoverflow.com/questions/15653688/preserving-column-order-in-python-pandas-dataframe
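For anyone hitting the same problem, here is a minimal sketch of one way to pin the column order when writing the input file with Python's standard csv module. The column names come from the header above; the helper names (write_annotations, header_is_expected) and the sample rows are hypothetical, and whether fast_dawid_skene.py expects a header row at all is not stated in this thread, so adjust accordingly.

```python
import csv
import io

# Column order assumed from the header quoted above:
# annotator first, then question (document), then the annotation.
FIELDNAMES = ["annotator_id", "question_id", "annotation_id"]


def write_annotations(rows, handle):
    """Write annotation dicts as CSV, forcing the column order in FIELDNAMES."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


def header_is_expected(handle):
    """Return True if the first CSV row matches FIELDNAMES exactly."""
    header = next(csv.reader(handle), None)
    return header == FIELDNAMES


# Hypothetical sample data; the dict keys are deliberately out of order.
rows = [
    {"question_id": 1, "annotation_id": 0, "annotator_id": 7},
    {"annotation_id": 1, "annotator_id": 8, "question_id": 1},
]

# An in-memory buffer stands in for a real file opened with newline="".
buffer = io.StringIO()
write_annotations(rows, buffer)
buffer.seek(0)
print(header_is_expected(buffer))
```

Because the order is taken from the fieldnames list rather than from the dicts, the output columns stay fixed regardless of how each row was built. When using pandas instead, passing an explicit column list to DataFrame.to_csv via its columns parameter should have the same effect.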

sukrutrao commented 5 years ago

Thanks for the update. To clarify, does the code in the repository or the pandas version used have a bug in conversion to CSV? The link you shared seems to suggest that there should be no issues in pandas 0.20.2. However, if there is one, please let us know. Thanks!

sukrutrao commented 5 years ago

Closing since the issue was resolved. Please comment/reopen if the problem persists. Thanks.