isi-vista / adam

Abduction to Demonstrate an Articulate Machine
MIT License

Unknown objects rehearsal experiment #1157

Open spigo900 opened 2 years ago

spigo900 commented 2 years ago

We're interested in checking what the object learner does when it decodes unknown objects. We want to run a "rehearsal" experiment here.

As part of this task, you'll need to:

  1. Run the unknown objects script on the M5 objects train curriculum, removing the list of objects identified earlier (from your look at object contrasts and the related discussion). That is:
    1. apple
    2. ball
    3. chair
    4. cube block
    5. window
  2. Train an object GNN module (#1110 / #1151) on the rehearsal curriculum.
  3. Evaluate that GNN on train and on test. It's worth recording train accuracy for completeness, though it doesn't tell us much; the main reason to run inference on train is to produce the decode files ADAM needs (the `feature.yaml` files). Also record the test accuracy, which I'd expect to be very similar to ADAM's test accuracy.
  4. Run ADAM with the subset learner over the resulting curriculum/decode. The parameters should match those for the M5 objects curriculum except for the train curriculum.
  5. Since it's ~easy to do (we already have observers set up for it), it's probably worth collecting and comparing per-object accuracy and qualitative outcome results against the m5_objects_v0_with_mugs baseline results.
  6. The main interesting thing to look at here is the per-object results. I think this means:
    1. A table of ADAM's train vs. test accuracies in the baseline (m5_objects_v0_with_mugs) case and for "unknown objects." (Mostly for completeness.)
    2. The confusion matrices for ADAM's output on train and on test. (details on confusion matrices: see #1156.)
    3. Maybe separate matrices for the GNN?
    4. Separately, a plot of just those rows of the test-time confusion matrix (or matrices) corresponding to the objects removed from train (the "unknown objects").
    5. As with #1156, remember to avoid red/green contrasts in plots due to colorblindness.
    6. A writeup containing these figures plus discussion focusing on the unknown objects outcomes.
      1. e.g., we expect the GNN to confuse unknown objects with ones it saw during training; how weird vs. "reasonable" are the confusions it makes? Is it confusing apple/ball with orange, or with book? etc.
      2. This probably requires manually looking at the stroke and stroke graph images for the unknown objects in the test set.
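The step-1 filtering amounts to dropping every sample for the five held-out objects before training. A minimal sketch, assuming a simple list-of-dicts representation of curriculum samples — the names here (`filter_rehearsal`, the `"object"` key) are illustrative, not ADAM's actual curriculum API:

```python
# Hypothetical sketch of the step-1 filtering: drop the five held-out objects
# from the train curriculum. The sample representation is assumed, not ADAM's
# real curriculum format.
HELD_OUT = {"apple", "ball", "chair", "cube block", "window"}

def filter_rehearsal(samples):
    """Keep only samples whose object label is not one of the held-out objects."""
    return [s for s in samples if s["object"] not in HELD_OUT]

if __name__ == "__main__":
    samples = [
        {"id": 0, "object": "apple"},
        {"id": 1, "object": "orange"},
        {"id": 2, "object": "window"},
        {"id": 3, "object": "book"},
    ]
    kept = filter_rehearsal(samples)
    print([s["object"] for s in kept])  # ['orange', 'book']
```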
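For the step-5/6 bookkeeping, a plain-Python sketch of building a confusion matrix from (gold, predicted) label pairs, deriving per-object accuracy, and pulling out just the unknown-object rows for the step-6.4 plot. The pair format is an assumption — ADAM's observers may record decodes differently:

```python
from collections import Counter, defaultdict

# The unknown objects removed from train (see step 1).
UNKNOWN = {"apple", "ball", "chair", "cube block", "window"}

def confusion_matrix(pairs):
    """Count (gold, predicted) pairs into a nested gold -> predicted -> count map."""
    matrix = defaultdict(Counter)
    for gold, pred in pairs:
        matrix[gold][pred] += 1
    return matrix

def per_object_accuracy(matrix):
    """Fraction of each gold object's decodes that were labeled correctly."""
    return {gold: row[gold] / sum(row.values()) for gold, row in matrix.items()}

def unknown_object_rows(matrix):
    """Only the rows for objects removed from train, for the step-6.4 plot."""
    return {gold: dict(row) for gold, row in matrix.items() if gold in UNKNOWN}

if __name__ == "__main__":
    # Toy decode results, not real experiment output.
    pairs = [("apple", "orange"), ("apple", "orange"), ("apple", "book"),
             ("ball", "orange"), ("orange", "orange"), ("book", "book")]
    m = confusion_matrix(pairs)
    print(per_object_accuracy(m))   # apple/ball get 0.0; orange/book get 1.0
    print(unknown_object_rows(m))   # only the apple and ball rows survive
```

For the actual figures, a sequential colormap such as matplotlib's `viridis` sidesteps the red/green contrast flagged in #1156.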