tetherless-world / mowgli-etl

DARPA Machine Common Sense (MCS) Multi-modal Open World Grounded Learning and Inference (MOWGLI) Extract-Transform-Load sub-project
MIT License

Adapt portal_benchmark pipeline to Henrique's schema format #157

Closed by gordom6 3 years ago

gordom6 commented 4 years ago

Use Jason's repository: https://github.com/gychant/CommonsenseBenchmark

Leave the existing transformer code in place until it is redundant with the data in Jason's repo. That code covers KagNet, the CommonsenseQA Benchmark, BenchmarkSubmission, and their trees.

Translate the following, in that order:

1. test_data/benchmarks.json (uses our Benchmark, BenchmarkDataset, BenchmarkQuestion)
2. test_data/samples.json (uses BenchmarkSubmission, BenchmarkAnswer)
3. data/* (a mix of BenchmarkQuestion, BenchmarkSubmission, BenchmarkAnswer)

May need to adapt models, which will have to be reflected in Scala (and possibly TypeScript). Discuss major structural changes with me first, please.

gordom6 commented 4 years ago

Jason has changed /converted/ to use the new JSON-LD format, so you can concentrate on that format.
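
As a starting point for the JSON-LD work, here is a rough sketch of how nodes in a converted file might be grouped by model before mapping them onto our classes. It is only an illustration under assumptions: the model names come from this issue, but whether the converted files use those exact "@type" values, and whether they wrap nodes in "@graph", has not been checked against Jason's repo. The function name `group_nodes_by_type` and the sample document are hypothetical. The sketch uses only the standard library and does no JSON-LD expansion, so prefixed or full-IRI type names would not match as written.

```python
import json
from collections import defaultdict
from typing import Any, Dict, List

# Model names taken from this issue. Assumption: the JSON-LD files use
# these exact strings as "@type" values.
MODEL_TYPES = {
    "Benchmark",
    "BenchmarkDataset",
    "BenchmarkQuestion",
    "BenchmarkSubmission",
    "BenchmarkAnswer",
}


def group_nodes_by_type(document: Any) -> Dict[str, List[dict]]:
    """Group top-level JSON-LD nodes by "@type", keeping only known models."""
    # A JSON-LD document can be a single node, a list of nodes,
    # or an object that holds its nodes under "@graph".
    if isinstance(document, dict) and "@graph" in document:
        nodes = document["@graph"]
    else:
        nodes = document
    if isinstance(nodes, dict):
        nodes = [nodes]

    grouped: Dict[str, List[dict]] = defaultdict(list)
    for node in nodes:
        if not isinstance(node, dict):
            continue
        # "@type" may be a single string or a list of strings.
        node_types = node.get("@type", [])
        if isinstance(node_types, str):
            node_types = [node_types]
        for node_type in node_types:
            if node_type in MODEL_TYPES:
                grouped[node_type].append(node)
    return dict(grouped)


# Hypothetical sample document; the "@id" values are made up.
sample = json.loads(
    """
    {
      "@graph": [
        {"@id": "q1", "@type": "BenchmarkQuestion"},
        {"@id": "s1", "@type": ["BenchmarkSubmission"]},
        {"@id": "other", "@type": "Thing"}
      ]
    }
    """
)
grouped = group_nodes_by_type(sample)
```

Nested nodes are not visited here; if the converted files embed answers inside questions or submissions, the grouping would need a recursive walk.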