wingz1 closed this issue 4 years ago
I have a public branch, https://github.com/schmidek/dygiepp/tree/multitask, with some changes needed to train DYGIE++ on KnowledgeNet. In particular, the changes account for the fact that KnowledgeNet is not exhaustively annotated for all predicates on all sentences. The data format is the same as https://github.com/dwadden/dygiepp/blob/master/doc/data.md#data-format, with the addition of one field, annotatedPredicates, which is just a list of which predicates were annotated for each sentence. Unfortunately, I can't easily share the code I used to convert the dataset at this time, as it has some internal dependencies.
Thanks for the reply. Given that you can't easily share the conversion code, could you share the reformatted KnowledgeNet dataset instead (i.e., the train.json file, and perhaps dev.json or test.json)? I'd like to try training DyGIE++ on KnowledgeNet.
Hi, is the reformatted KnowledgeNet dataset in DYGIE++ format available for sharing? If so, where? Thanks!
Sure, here's the dataset
Thanks!
I see that only dev.json has anything in it; train.json is empty. (Maybe it's still uploading?)
Fixed
How did you convert the KnowledgeNet dataset into the format DYGIE++ ingests for training and testing, in order to get the scores you quote?