originrose / cortex

Machine learning in Clojure
Eclipse Public License 1.0

API Question: Class Mapping and execute/run return value #182

Open hellonico opened 7 years ago

hellonico commented 7 years ago

Supposing I train a network with:

(classification/perform-experiment
  (initial-description image-size image-size num-classes)
  train-ds
  test-ds
  observation->image
  class-mapping
  {})

Then I make a guess on a new observation (after loading the nippy file) with:

(def nippy
  (util/read-nippy-file "trained-network.nippy"))
(execute/run nippy (into-array (image-file->observation "cat.png")))

execute/run returns a number, which has to be converted back to a class name to find out the actual guess. For example, if I get 0, I need to call an extra function like:

(defn index->class-name [n]
  (nth ["cat" "dog"] n))

to get the expected value "cat" from the returned result, 0.

But I thought that was the reason for the class-mapping parameter passed to perform-experiment during the training step?

  (def class-mapping
    {:class-name->index (zipmap ["cat" "dog"] (range))
     :index->class-name (zipmap (range) ["cat" "dog"])})

Shouldn't the network in the nippy file serialize the class-mapping and be able to return the decoded value?

harold commented 7 years ago

Hi, thanks for these interesting thoughts.

The class mapping could be reused. Note that (index->class-name 0) ;=> "cat" and ((:index->class-name class-mapping) 0) ;=> "cat", given what you've written.

I'm not so sure that the class-mapping should necessarily be part of the network, though associng it on would be easy enough (the saved network is just a Clojure map).

Perhaps it would be helpful for me to mention that execute/run is a lot more general than the classification example (execute/run is the main mechanism for inferring anything with a neural net in cortex).

I'm thinking now that perhaps what would be helpful is a classification-example specific function that composes inference with resolving the class name from the class mapping.
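A minimal sketch of what such a helper might look like, assuming the inference step yields a class index as described above. The names classify and infer-fn are hypothetical and not part of cortex; infer-fn stands in for whatever wraps execute/run in the example.

```clojure
(def class-mapping
  {:class-name->index (zipmap ["cat" "dog"] (range))
   :index->class-name (zipmap (range) ["cat" "dog"])})

(defn classify
  "Runs infer-fn on the observation and resolves the resulting index to a
  class name. infer-fn is a hypothetical stand-in for a wrapper around
  execute/run that returns a class index."
  [infer-fn class-mapping observation]
  (let [idx (infer-fn observation)]
    (get (:index->class-name class-mapping) idx)))

;; Stub inference function that always answers index 0, for illustration only.
(classify (constantly 0) class-mapping :some-observation)
;; => "cat"
```

Passing the inference function in as an argument keeps the helper independent of how the network is actually run.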

What do you think?

hellonico commented 7 years ago

Thank you for the quick reply.

The example I am working on separates the training part of the network: https://github.com/hellonico/cortex/blob/master/examples/mnist-catsndogs/src/catsdogs/training.clj

from the network usage part (what I would say is the client part): https://github.com/hellonico/cortex/blob/master/examples/mnist-catsndogs/src/catsdogs/simple.clj

So ideally, from a network user's point of view, the class-mapping would already be included in the network or automatically pulled in from a separate mapping file? In the latter case, when doing:

(def nippy (util/read-nippy-file "trained-network.nippy"))

if there is a file named, say, trained-network.mappings, it could also load that file and resolve the mappings directly? Those mappings would of course be optionally saved during the training phase.
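A rough sketch of the separate-file idea, assuming the mappings are written as EDN next to the network file. The functions mappings-path, save-mappings! and load-mappings are hypothetical names, not cortex functions.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.java.io :as io]
         '[clojure.string :as str])

(def class-mapping
  {:class-name->index (zipmap ["cat" "dog"] (range))
   :index->class-name (zipmap (range) ["cat" "dog"])})

(defn mappings-path
  "Derives the mappings file name from the network file name,
  e.g. trained-network.nippy -> trained-network.mappings."
  [network-path]
  (str/replace network-path #"\.nippy$" ".mappings"))

(defn save-mappings!
  "Writes the class mapping as EDN next to the network file (training side)."
  [network-path class-mapping]
  (spit (mappings-path network-path) (pr-str class-mapping)))

(defn load-mappings
  "Reads the mappings file if it exists; returns nil otherwise (client side)."
  [network-path]
  (let [f (io/file (mappings-path network-path))]
    (when (.exists f)
      (edn/read-string (slurp f)))))
```

Because the mapping only contains keywords, integers and strings, it should survive the EDN round trip unchanged.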

Nico

harold commented 7 years ago

You are welcome.

In that case I think it makes the most sense for you to just assoc the class mappings (which should just be maps of integers and strings) onto the network itself (which, again, is just a Clojure map). Putting it in a separate file could definitely also be made to work, but isn't as parsimonious as just putting it in the network map.
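As a sketch of that suggestion, assuming the network really is a plain map: the :class-mapping key and the helper names below are illustrative choices, and the network value is only a placeholder for a trained network.

```clojure
(def class-mapping
  {:class-name->index (zipmap ["cat" "dog"] (range))
   :index->class-name (zipmap (range) ["cat" "dog"])})

;; Placeholder standing in for a trained network map.
(def network {:placeholder "trained network"})

(defn with-class-mapping
  "Attaches the class mapping to the network map under :class-mapping
  (an illustrative key, not one cortex defines)."
  [network class-mapping]
  (assoc network :class-mapping class-mapping))

(defn index->class-name
  "Looks up the class name for an inferred index on a network that carries
  its class mapping."
  [network idx]
  (get-in network [:class-mapping :index->class-name idx]))

(index->class-name (with-class-mapping network class-mapping) 0)
;; => "cat"
```

The training side would attach the mapping before saving the network, and the client side would only need the loaded map.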

I am not sure this warrants any changes to Cortex, but if you do any experiments, feel free to share them.

Hope that helps.

cnuernber commented 7 years ago

Formalizing the pathway for classification where we do in fact save the class mappings makes a lot of sense. We have had bugs where the order of the class mappings changed, or they were different than expected, leading to an odd demo or two.
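One way to guard against that kind of ordering bug, sketched under the assumption that a class mapping has the two-key shape used earlier in this thread; consistent-mapping? is a hypothetical helper.

```clojure
(require '[clojure.set :as set])

(defn consistent-mapping?
  "True when :index->class-name is exactly the inverse of :class-name->index."
  [{:keys [class-name->index index->class-name]}]
  (= index->class-name (set/map-invert class-name->index)))

(consistent-mapping?
  {:class-name->index {"cat" 0 "dog" 1}
   :index->class-name {0 "cat" 1 "dog"}})
;; => true

(consistent-mapping?
  {:class-name->index {"cat" 0 "dog" 1}
   :index->class-name {0 "dog" 1 "cat"}})
;; => false
```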

This would in fact be a helpful thing moving forward, especially if it were robust, which means it would work with multi-target networks.

I agree with Harold that the best way is just to assoc the mappings onto the built network, perhaps under a :mappings tag, with the id of the output node as the key that gets you to the actual mappings map. Then we would need to build out that pathway in experiment/classification.clj and test it.
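A sketch of how that layout could look for a multi-target network. The output ids :animal and :color, the helper names, and the placeholder network are all assumptions made for illustration; nothing here exists in cortex yet.

```clojure
(defn assoc-mappings
  "Stores a class mapping for one output node under [:mappings output-id].
  Both directions are derived from the same ordered vector of class names,
  so they cannot drift apart."
  [network output-id class-names]
  (assoc-in network [:mappings output-id]
            {:class-name->index (zipmap class-names (range))
             :index->class-name (zipmap (range) class-names)}))

(defn decode
  "Resolves an inferred index for the given output node to its class name."
  [network output-id idx]
  (get-in network [:mappings output-id :index->class-name idx]))

;; Placeholder network with two hypothetical output nodes.
(def network
  (-> {:placeholder "trained network"}
      (assoc-mappings :animal ["cat" "dog"])
      (assoc-mappings :color ["black" "white" "brown"])))

(decode network :animal 1) ;; => "dog"
(decode network :color 2)  ;; => "brown"
```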