{
"label": "B-LOC B-LOC O B-PERS I-PERS O O O O B-PERS I-PERS O O O O O O O O O O O O O O O O O O B-LOC B-LOC O O O O O O O O O O O O B-PERS O O O O O O O O O O O O O O O O O O O O B-LOC B-LOC O O O O O O O",
"model_output": "Output: [('الصالحية', 'LOC'), ('المفرق', 'LOC'), ('-', 'O'), ('غيث', 'PER'), ('الطراونة', 'PER'), ('-', 'O'), ('أمر', 'O'), ('جلالة', 'O'), ('الملك', 'PER'), ('عبدالله', 'PER'), ('الثاني', 'PER'), ('أمس', 'O'), ('بتنفيذ', 'O'), ('حزمة', 'O'), ('من',... ('التحديات', 'O'), ('التي', 'O'), ('يواجهها', 'O'), ('أبناء', 'O'), ('الصالحية', 'LOC'), ('ونايفة', 'LOC'), ('خصوصا', 'O'), ('فيما', 'O'), ('يتعلق', 'O'), ('بمشكلتي', 'O'), ('الفقر', 'O'), ('والبطالة', 'O'), ('.', 'O')]",
}
Should we count a bare LOC as "B-LOC"? And for consecutive tokens with the same label, should the first be "B-" and the second "I-"? That is not always correct: the first two tokens in the example above are both B-LOC in the gold labels. Up for discussion, @firojalam @baselmousi.
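To make the ambiguity concrete, here is a minimal sketch of the naive conversion. The helper name `to_bio` is hypothetical and not taken from the codebase. It tags the first token of a run as "B-" and every following token with the same label as "I-":

```python
def to_bio(labels):
    """Naively add BIO prefixes to bare entity labels (hypothetical helper)."""
    bio = []
    prev = "O"
    for label in labels:
        if label == "O":
            bio.append("O")
        elif label == prev:
            # Same type as the previous token: assume the entity continues.
            bio.append("I-" + label)
        else:
            # New type, or first entity token after an "O": start an entity.
            bio.append("B-" + label)
        prev = label
    return bio


# First five predicted labels from the example above.
predicted = ["LOC", "LOC", "O", "PER", "PER"]
print(to_bio(predicted))
# ['B-LOC', 'I-LOC', 'O', 'B-PER', 'I-PER']
```

The gold labels for the same tokens are `B-LOC B-LOC O B-PERS I-PERS`, so this rule gets the second token wrong: two adjacent but separate locations are merged into one entity. The example also shows that the model returns `PER` where the gold data uses `PERS`, so the tag names would need mapping as well.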
Thanks for bringing this up. I will prepare output files that compare the gold labels with the post-processed responses for both gpt-3.5 and gpt-4. Evaluating on 5 labels instead of 9 should improve the results quite a bit.
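A rough sketch of the 5-label comparison, assuming the 9 labels are the "B-"/"I-" variants of four entity types plus "O", so that dropping the prefixes leaves 5. The helper name `strip_bio` is hypothetical:

```python
def strip_bio(label):
    """Drop a leading B-/I- prefix, e.g. 'B-LOC' -> 'LOC'; 'O' is unchanged."""
    if label.startswith(("B-", "I-")):
        return label[2:]
    return label


# First six gold labels from the example above.
gold = "B-LOC B-LOC O B-PERS I-PERS O".split()
collapsed = [strip_bio(label) for label in gold]
print(collapsed)
# ['LOC', 'LOC', 'O', 'PERS', 'PERS', 'O']
```

With the prefixes removed, the gold labels can be compared token by token against the bare labels in the model output, which avoids having to guess entity boundaries. The `PERS` versus `PER` naming difference still has to be reconciled before comparing.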