idio / wiki2vec

Generating Vectors for DBpedia Entities via Word2Vec and Wikipedia Dumps. Questions? https://gitter.im/idio-opensource/Lobby

Whether "dbc:Earthquakes" or "dbr:Vanilla_Ice" available in English Wikipedia (Feb 2015) 1000 dimension - No stemming - 10skipgram #19

Open zhq2009 opened 8 years ago

zhq2009 commented 8 years ago

Hello,

We are currently using the dataset "English Wikipedia (Feb 2015) 1000 dimension - No stemming - 10skipgram" and are looking for test cases based on http://dbpedia.org/page/Earthquake. We found that "DBPEDIA_ID/Vanilla_Ice" is available in the dataset. But when we try "dbc:Earthquakes" or "dbr:Vanilla_Ice", we get a KeyError: "dbc:Earthquakes" not in vocabulary and "dbr:Vanilla_Ice" not in vocabulary. Does the dataset store any entries under "dbc:" or "dbo:"?

Thank you

dav009 commented 8 years ago

Hi @zhq2009,

I did not tag categories, so at the moment we don't have any dbc: or dbo: entries. If these were to be included, should the categories be tagged in the text every time an item of the category occurs?
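As a minimal sketch of the lookup problem discussed above: the thread only confirms that the "DBPEDIA_ID/Vanilla_Ice" form is present and that "dbr:" and "dbc:" forms are not. The helper names below (to_wiki2vec_key, lookup_key) and the small vocabulary set are hypothetical stand-ins, not part of wiki2vec; with a real model you would test membership against the loaded vocabulary instead.

```python
from typing import Optional, Set

# Token prefix reported to work in this thread ("DBPEDIA_ID/Vanilla_Ice").
ENTITY_PREFIX = "DBPEDIA_ID/"

# Common ways of writing a DBpedia resource; only resources are handled here.
RESOURCE_PREFIXES = ("dbr:", "http://dbpedia.org/resource/")


def to_wiki2vec_key(name: str) -> Optional[str]:
    """Rewrite a DBpedia resource name as a DBPEDIA_ID/ token.

    Returns None for anything that is not a resource (e.g. dbc: or dbo:),
    since the thread states that categories were never tagged.
    """
    for prefix in RESOURCE_PREFIXES:
        if name.startswith(prefix):
            return ENTITY_PREFIX + name[len(prefix):]
    return None


def lookup_key(name: str, vocabulary: Set[str]) -> Optional[str]:
    """Return the vocabulary token for `name`, or None if it is absent."""
    key = to_wiki2vec_key(name)
    if key is not None and key in vocabulary:
        return key
    return None


# Hypothetical stand-in for the model's vocabulary.
vocabulary = {"DBPEDIA_ID/Vanilla_Ice", "earthquake"}

print(lookup_key("dbr:Vanilla_Ice", vocabulary))
print(lookup_key("dbc:Earthquakes", vocabulary))
```

Returning None instead of raising keeps the caller from hitting the KeyError described in the question when a name has no vector.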

dav009 commented 8 years ago

If I understand correctly, you want vectors for categories as well? Category vectors could easily be generated by appending the categories of each mention.

dbo: terms sound a bit more complicated, though.
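One way to read "appending the categories of a mention" is sketched below, under stated assumptions: the corpus is already tokenized with DBPEDIA_ID/ entity tokens, and an entity-to-categories mapping is available. The DBPEDIA_CAT/ prefix, the append_categories helper, and the example category data are all hypothetical; wiki2vec does not define them.

```python
from typing import Dict, Iterable, List

ENTITY_PREFIX = "DBPEDIA_ID/"     # entity token prefix seen in this thread
CATEGORY_PREFIX = "DBPEDIA_CAT/"  # hypothetical prefix for category tokens


def append_categories(tokens: Iterable[str],
                      categories: Dict[str, List[str]]) -> List[str]:
    """Insert category tokens right after each entity mention.

    Placing them next to the mention lets the categories share the
    mention's context window during Word2Vec training.
    """
    out: List[str] = []
    for token in tokens:
        out.append(token)
        if token.startswith(ENTITY_PREFIX):
            entity = token[len(ENTITY_PREFIX):]
            for category in categories.get(entity, []):
                out.append(CATEGORY_PREFIX + category)
    return out


# Made-up mapping and sentence, for illustration only.
categories = {"Vanilla_Ice": ["American_rappers"]}
sentence = ["DBPEDIA_ID/Vanilla_Ice", "released", "an", "album"]

augmented = append_categories(sentence, categories)
print(augmented)
```

The augmented sentences would then be fed to training in place of the originals, so each category token gets its own vector.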

zhq2009 commented 8 years ago

Hello David,

Yes, we are looking for category vectors as well.

Thank you for your help.

Sincerely,

Hanqing

On Mon, Jun 20, 2016 at 12:10 PM, David Przybilla notifications@github.com wrote:

If I understand correctly, you want vectors for categories as well ?


dav009 commented 8 years ago

If you provide:

I would gladly add the changes to support generating such vectors.

zhq2009 commented 8 years ago

Sure, we can discuss later how to map and load them.

We are also wondering why some entities (dbr:) are missing from the DBpedia vectors. Any suggestions would be appreciated.

Thank you.

dav009 commented 8 years ago

Some reasons: