sehsanm / embedding-benchmark

Word Embedding benchmark project By Shahid Beheshti University NLP Lab
GNU General Public License v3.0

adding analogy data set #29

Closed mohammadreza-molapanah closed 5 years ago

abb4s commented 5 years ago

We assumed the categories in the test set form a hierarchy, with every category being a subcategory of either the semantic or the syntactic set, and that you might want to measure accuracy over all categories under semantic (or syntactic). But with a single CSV file, you cannot tell whether each category's parent is syntactic or semantic.

sehsanm commented 5 years ago

Hi, you are right, but I think that if we produce the output as a CSV, each researcher can do their own analysis easily in Excel. I prefer to keep our format and code simple and leave the statistical processing (e.g. grouping the categories) to the researcher in Excel, which is a very powerful tool for this purpose.
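
To illustrate the trade-off discussed here, the following is a minimal sketch, not the project's actual code. It assumes a flat CSV with one row per category and shows how a researcher could apply their own semantic/syntactic grouping afterwards. The category names, the counts, the column names, and the `PARENT` mapping are all hypothetical and exist only for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical per-category results: (category, correct, total).
RESULTS = [
    ("capital-common-countries", 40, 50),
    ("family", 30, 40),
    ("gram1-adjective-to-adverb", 10, 40),
    ("gram2-opposite", 20, 30),
]

# Hypothetical mapping kept by the researcher, outside the benchmark output.
PARENT = {
    "capital-common-countries": "semantic",
    "family": "semantic",
    "gram1-adjective-to-adverb": "syntactic",
    "gram2-opposite": "syntactic",
}


def write_flat_csv(results, stream):
    """Write one row per category, with no hierarchy information."""
    writer = csv.writer(stream)
    writer.writerow(["category", "correct", "total", "accuracy"])
    for category, correct, total in results:
        writer.writerow([category, correct, total, f"{correct / total:.4f}"])


def group_by_parent(stream, parent):
    """Aggregate accuracy per parent group from the flat CSV."""
    totals = defaultdict(lambda: [0, 0])  # group -> [correct, total]
    for row in csv.DictReader(stream):
        group = parent.get(row["category"], "unknown")
        totals[group][0] += int(row["correct"])
        totals[group][1] += int(row["total"])
    return {group: c / t for group, (c, t) in totals.items()}


# Usage: write the flat CSV to memory, then group it afterwards.
buffer = io.StringIO()
write_flat_csv(RESULTS, buffer)
buffer.seek(0)
grouped = group_by_parent(buffer, PARENT)
```

Keeping the raw `correct` and `total` counts in the file, rather than only the accuracy, lets the grouped accuracy be computed exactly instead of averaging per-category percentages.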