Closed RicardoUsbeck closed 10 years ago
Wrapper implemented. Experiments can be run on both the training and the test set. Please check before closing the issue.
Note that the wrapper expects each line of the dataset to have the form: tweet_id \t tweet_text \t list(pair)
where pair = entity_mention \t dbpedia_uri, and consecutive pairs are also separated by \t.
Example: 91649478326624256 "When it hurts to look back, and you're afraid to look ahead, you can look beside you and a #Leo will be there - #LeoFriendship" Leo http://dbpedia.org/resource/Leo_(astrology)
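To make the line format concrete, here is a minimal Python sketch of a parser for one such line. It is not the code of the GERBIL wrapper; the names `AnnotatedTweet` and `parse_line` are hypothetical, and it assumes the fields are separated by literal tab characters and that the tweet text itself contains no tab.

```python
from typing import List, NamedTuple, Tuple


class AnnotatedTweet(NamedTuple):
    """One dataset line: tweet id, tweet text, and (mention, URI) pairs."""
    tweet_id: str
    text: str
    annotations: List[Tuple[str, str]]


def parse_line(line: str) -> AnnotatedTweet:
    """Parse 'tweet_id \\t tweet_text \\t mention \\t uri \\t ...' into a record."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 2:
        raise ValueError("expected at least tweet_id and tweet_text")

    tweet_id, text = fields[0], fields[1]

    # Everything after the text is a flat list of mention/URI fields.
    rest = fields[2:]
    # Tolerate trailing tabs by dropping empty fields at the end.
    while rest and rest[-1] == "":
        rest.pop()
    if len(rest) % 2 != 0:
        raise ValueError("mention without a matching URI in tweet " + tweet_id)

    # Regroup the flat list into (entity_mention, dbpedia_uri) pairs.
    annotations = list(zip(rest[0::2], rest[1::2]))
    return AnnotatedTweet(tweet_id, text, annotations)
```

A tweet without any annotation yields an empty `annotations` list, and an odd number of trailing fields is reported as an error rather than silently dropped.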
Hence, first download the gold standard (GS) [1], then fetch the corresponding text for each tweet_id.
[1] - http://www.scc.lancs.ac.uk/microposts2014/challenge/dataset/microposts2014-neel_challenge_gs.zip
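The preparation step described above (combine the GS annotations with the separately downloaded tweet texts) could look roughly like the following Python sketch. It assumes the GS has already been loaded into a mapping from tweet_id to (mention, URI) pairs and the texts into a mapping from tweet_id to text; the layout of the GS file itself is not covered here, and `format_line` and `build_dataset` are hypothetical helper names.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (entity_mention, dbpedia_uri)

# Map tab, carriage return and newline to a space each.
_WHITESPACE_FIX = str.maketrans("\t\r\n", "   ")


def format_line(tweet_id: str, text: str, pairs: List[Pair]) -> str:
    """Build one 'tweet_id \\t tweet_text \\t mention \\t uri ...' line."""
    # A tab or line break inside the text would break the column layout.
    fields = [tweet_id, text.translate(_WHITESPACE_FIX)]
    for mention, uri in pairs:
        fields.extend([mention, uri])
    return "\t".join(fields)


def build_dataset(gold: Dict[str, List[Pair]], texts: Dict[str, str]) -> List[str]:
    """Join gold-standard annotations with the downloaded tweet texts."""
    lines = []
    for tweet_id, pairs in gold.items():
        text = texts.get(tweet_id)
        if text is None:
            # Assumption: tweets whose text could not be downloaded are skipped.
            continue
        lines.append(format_line(tweet_id, text, pairs))
    return lines
```

Skipping tweets without text is only one possible policy; the original comment does not say how unavailable tweets should be handled.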
Write a wrapper for the Microposts2014 dataset. Annotate the license, experiment type and language. Give provenance. Update https://github.com/AKSW/gerbil/wiki/Licences-for-datasets