Closed janpf closed 3 years ago
That is a good suggestion, I will do it later.
I added two datasets from ABSA-Reproducibility a few days ago, but I didn't find easy-to-adapt Laptop 15 and 16 datasets. Can you give me a reference? Thank you for your help!
I added two datasets from ABSA-Reproducibility a few days ago
Thanks!
but I didn't find easy-to-adapt Laptop 15 and 16 datasets. Can you give me a reference?
Sure! I found them on the original task pages: https://alt.qcri.org/semeval2015/task12/index.php?id=data-and-tools https://alt.qcri.org/semeval2016/task5/index.php?id=data-and-tools
Where did you find the other datasets? Is there another source?
I added the TShirt and Television datasets from the ABSA-Reproducibility link.
Thanks! However, these datasets are not processed into the recommended format. Can you share the processed datasets instead? Since I am working on other topics, I may not be able to process the datasets in time.
I added the TShirt and Television datasets from the ABSA-Reproducibility link.
Thanks!
Thanks! However, these datasets are not processed into the recommended format. Can you share the processed datasets instead?
Ah, I thought you had a converter script for semeval-format => your format.
Would you mind sharing where you got your datasets from, if not from SemEval? Maybe they have a script? ;)
Unfortunately, we have to do the reformatting ourselves by writing code. As far as I know, there is no script that can do this for us.
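For anyone who needs to do this conversion by hand, here is a minimal sketch rather than an official PyABSA tool. It assumes the SemEval 2015/2016 XML layout (`sentence` elements containing a `text` element and `Opinion` elements with `target`, `polarity`, `from`, and `to` attributes) and assumes the common three-line APC layout (sentence with the aspect replaced by `$T$`, then the aspect, then the polarity). The function name `semeval_to_apc` and the sample review are made up for illustration, and the polarity string is copied unchanged, so the label convention may need adjusting.

```python
import xml.etree.ElementTree as ET


def semeval_to_apc(xml_text):
    """Turn SemEval-2015/2016-style XML into three-line APC records.

    Hypothetical helper: the output layout is an assumption, not a
    confirmed PyABSA specification.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for sentence in root.iter("sentence"):
        text = sentence.findtext("text") or ""
        for opinion in sentence.iter("Opinion"):
            target = opinion.get("target")
            polarity = opinion.get("polarity")
            # Opinions without an explicit target span cannot be mapped to $T$.
            if not target or target == "NULL" or polarity is None:
                continue
            try:
                start = int(opinion.get("from", ""))
                end = int(opinion.get("to", ""))
            except ValueError:
                continue  # missing or malformed character offsets
            # Only use the offsets if they really point at the target text.
            if text[start:end] != target:
                continue
            lines.append(text[:start] + "$T$" + text[end:])
            lines.append(target)
            lines.append(polarity)
    return "\n".join(lines)


# Made-up sample in the assumed XML layout, for illustration only.
SAMPLE = """<Reviews><Review rid="1"><sentences>
<sentence id="1:0"><text>The pizza was great but the service was slow.</text>
<Opinions>
<Opinion target="pizza" category="FOOD#QUALITY" polarity="positive" from="4" to="9"/>
<Opinion target="service" category="SERVICE#GENERAL" polarity="negative" from="28" to="35"/>
<Opinion target="NULL" category="RESTAURANT#GENERAL" polarity="positive" from="0" to="0"/>
</Opinions></sentence></sentences></Review></Reviews>"""

if __name__ == "__main__":
    print(semeval_to_apc(SAMPLE))
```

Opinions with `target="NULL"` or without usable offsets are skipped in this sketch, since there is no span to replace with `$T$`; whether that is the right choice depends on how the dataset will be used.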
@yangheng95 could you check whether my dataset format for SemEval2016 Task5 Subtask1 for APC in Dutch is correct? I would like to share other multilingual datasets with your repositories as well. Attaching the file for your reference: SemEval.Dutch.train.apc.txt
Hello, thanks for sharing. The format is correct, and the polarity labels are valid. You can open a PR with your datasets and include copyright information, e.g., the source and who processed them. I will merge it and register the dataset in PyABSA after the necessary tests and conversion to ATEPC format. Thanks again.
I will close this issue because it has been inactive for 3 weeks.
Hi, would you mind adding Laptop 15, 16 and Hotel from SemEval? As those are formatted identically to Restaurant 15, 16, I think they should import rather cleanly 😄
Additionally I'd suggest adding the datasets from https://github.com/rajdeep345/ABSA-Reproducibility/tree/main/code/datasets/semeval14, as those are already in identical format (as far as I can tell) and "only" the ATE part is missing.
Thanks for your work!