tetherless-world / mowgli-etl

DARPA Machine Common Sense (MCS) Multi-modal Open World Grounded Learning and Inference (MOWGLI) Extract-Transform-Load sub-project
MIT License

WebChild: create a spreadsheet randomly sampling some WebChild triples for humans to empirically judge triple quality #79

Closed gordom6 closed 4 years ago

gordom6 commented 4 years ago

The scores from WebChild shouldn't be included in the spreadsheet, in order to avoid biasing the humans.

To make the task easier, the annotation should be on a coarse scale: either -1 (low quality), 0 (neutral), 1 (high quality), or a category dropdown with 5 values, the middle one neutral.

gordom6 commented 4 years ago

The second part is to check whether the human annotations are correlated with the scores in WebChild.
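One way to check the correlation is a rank correlation, since the human annotations are ordinal. The following is a minimal sketch, assuming the annotations and the WebChild scores have already been loaded into two parallel lists, with `None` standing for a triple the annotator left blank. The function names are hypothetical, and Spearman's coefficient is my choice here, not something the issue prescribes.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks for `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


def spearman(annotations, scores):
    """Spearman rank correlation, skipping triples with no annotation (None)."""
    pairs = [(a, s) for a, s in zip(annotations, scores) if a is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two annotated triples")
    annotation_ranks = average_ranks([a for a, _ in pairs])
    score_ranks = average_ranks([s for _, s in pairs])
    return pearson(annotation_ranks, score_ranks)
```

A value near 0 would indicate that the scores carry little information about human judgment. With a small sample, the coefficient should be read with caution.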

gordom6 commented 4 years ago

Sam identified the following caveats:
(1) Some of the words are obscure, but WebChild links them to WordNet synsets and has a definitions file for the words. The spreadsheet should include (subject word, subject word definition, relation, object word, object word definition) in addition to the annotation fields.
(2) Many of the triples are not common sense. The annotator should identify these by not annotating them.
(3) There are many more "physical" relations than other kinds. We may want to sample the sets of relations of different types separately.
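Sampling per relation type and writing the spreadsheet without the WebChild scores could look roughly like this. It is a sketch under stated assumptions: triples are plain dicts, the column names are hypothetical, and `relation_type_of` is a caller-supplied function mapping a relation name to a type label such as "physical". None of these names come from the mowgli code base.

```python
import csv
import random
from collections import defaultdict

# Hypothetical column layout for the annotation spreadsheet.
COLUMNS = [
    "subject",
    "subject_definition",
    "relation",
    "object",
    "object_definition",
    "annotation",
]


def sample_by_relation_type(triples, relation_type_of, per_type, seed=0):
    """Randomly sample up to `per_type` triples from each relation type."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    groups = defaultdict(list)
    for triple in triples:
        groups[relation_type_of(triple["relation"])].append(triple)
    sample = []
    for type_label in sorted(groups):
        group = groups[type_label]
        sample.extend(rng.sample(group, min(per_type, len(group))))
    rng.shuffle(sample)  # avoid presenting the annotator with one type at a time
    return sample


def write_annotation_sheet(sample, out_file):
    """Write the sample as CSV with an empty annotation column.

    Keys not listed in COLUMNS (for example a score) are dropped, so the
    annotator never sees the WebChild confidence.
    """
    writer = csv.DictWriter(out_file, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    for triple in sample:
        row = dict(triple)
        row["annotation"] = ""
        writer.writerow(row)
```

The scores would need to be kept in a separate file keyed by row so they can be joined back after annotation.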

gordom6 commented 4 years ago

The basic approach:

The annotation column should be 1-5:
Blank = not common sense
1 = definitely wrong
2 = probably wrong / wrong most of the time
3 = could be right or wrong depending on circumstances (neutral)
4 = probably right / right most of the time
5 = always right
The annotator should not do any research and should rely on snap judgments.
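When reading the completed spreadsheet back, the annotation cells can be validated against this scale. The sketch below assumes each cell arrives as a string; `SCALE` and `parse_annotation` are hypothetical names introduced for illustration.

```python
# Meaning of each value on the 1-5 scale described above.
SCALE = {
    1: "definitely wrong",
    2: "probably wrong",
    3: "could be right or wrong (neutral)",
    4: "probably right",
    5: "always right",
}


def parse_annotation(cell):
    """Return an int from 1 to 5, or None for a blank cell (not common sense)."""
    text = cell.strip()
    if not text:
        return None
    value = int(text)  # raises ValueError for non-numeric input
    if value not in SCALE:
        raise ValueError("annotation must be between 1 and 5, got %r" % cell)
    return value
```

Returning `None` for blanks keeps "not common sense" distinct from the neutral value 3, so those triples can be excluded from any later correlation.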

gordom6 commented 4 years ago

@dfamilia33 You can put a script to do all of this in the root of the repository. It can reuse mowgli.* code. Download the ETL'd WebChild data from CircleCI artifacts or run the web_child ETL pipeline from the command line. Put it in the data/ directory and assume it's there when others run your script.

gordom6 commented 4 years ago

The conclusion was that the WebChild confidence scores are not well correlated with human judgment.