pengowen123 / cge

An implementation of the CGE neural network encoding, written in Rust.
Apache License 2.0

Redesign `Gene` and `Network` types #4

Closed · pengowen123 closed this 2 years ago

pengowen123 commented 2 years ago

This redesigns the `Gene` and `Network` types to be cleaner and more idiomatic. It also improves the performance of network evaluation by caching subnetwork results.
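
Concretely, the idea behind the caching is roughly the following (a simplified sketch with illustrative names, not the actual evaluation code): each subnetwork's output is stored the first time it is computed during a pass, and later references to the same subnetwork reuse the stored value.

```rust
use std::collections::HashMap;

/// Sketch only: memoize each neuron's output within a single evaluation pass so
/// that later references to the same subnetwork reuse the stored value instead
/// of re-evaluating it.
struct EvalCache {
    values: HashMap<usize, f64>, // neuron ID -> cached output for this pass
}

impl EvalCache {
    fn new() -> Self {
        Self { values: HashMap::new() }
    }

    /// Returns the cached value for `neuron_id`, computing and storing it on first use.
    fn get_or_compute<F: FnOnce() -> f64>(&mut self, neuron_id: usize, compute: F) -> f64 {
        *self.values.entry(neuron_id).or_insert_with(compute)
    }

    /// Cleared between evaluation passes so stale values are never reused.
    fn clear(&mut self) {
        self.values.clear();
    }
}
```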

cc @wbrickner How do these changes look? It will require you to change any parts of your code that handle `Gene`, but the new design should be cleaner overall. Some other things to note:

pengowen123 commented 2 years ago

After this PR is merged, I'll add the mutation operators, as well as more extensive tests that you can duplicate for const_cge.

wbrickner commented 2 years ago

I can provide comments in a little bit if you'd like.

Let me examine the info about input/output dimensionality, etc.; I need to think about this carefully. I think what I'm doing in const_cge is already correct (counting unique IDs of input genes), but it may not be. Also, for the input index, I believe I mimicked the original codebase in using the ID of the input gene.
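
For reference, this is roughly what I mean by counting unique IDs (the gene shape here is illustrative, not the exact const_cge code):

```rust
use std::collections::HashSet;

/// Illustrative gene shape; the real `Gene` type differs.
enum Gene {
    Input { id: usize, weight: f64 },
    Neuron { id: usize, weight: f64, num_inputs: usize },
    // jumper variants omitted
}

/// Input dimensionality = the number of distinct input IDs appearing in the genome.
fn input_count(genome: &[Gene]) -> usize {
    genome
        .iter()
        .filter_map(|gene| match gene {
            Gene::Input { id, .. } => Some(*id),
            _ => None,
        })
        .collect::<HashSet<_>>()
        .len()
}
```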

wbrickner commented 2 years ago

Comments, grouped by topic:

* Correctness
* API
* Performance + Scaling

```rust
/// Represents which transfer function to use for evaluating neural networks.
#[derive(Debug, Copy, Clone, Eq, PartialEq)]
pub enum Activation {
    /// Maps input to output directly, as if there is no transfer function.
    Linear,
    /// Outputs 1 if input is greater than 0, 0 otherwise.
    Threshold,
    /// Outputs 1 if input is greater than 0, 0 if input is equal to 0, -1 otherwise. Useful
    /// for simple problems and boolean logic, as it only allows three possible output values.
    Sign,
    /// A non-linear function. This function is the most general, so it should be defaulted to.
    Sigmoid,
    /// Hyperbolic tangent; like `Sigmoid`, but outputs in the range (-1, 1).
    Tanh,
    /// A cheaper sigmoid-like function: x / (1 + |x|).
    SoftSign,
    /// A smooth, non-saturating function that grows approximately linearly.
    BentIdentity,
    /// Rectified linear unit: outputs x if x is greater than 0, 0 otherwise.
    Relu,
}

impl Activation {
    /// Apply this activation function
    pub fn apply(&self, x: f64) -> f64 {
        match self {
            Activation::Linear       => Self::linear(x),
            Activation::Threshold    => Self::threshold(x),
            Activation::Sign         => Self::sign(x),
            Activation::Sigmoid      => Self::sigmoid(x),
            Activation::Tanh         => Self::tanh(x),
            Activation::SoftSign     => Self::soft_sign(x),
            Activation::BentIdentity => Self::bent_identity(x),
            Activation::Relu         => Self::relu(x),
        }
    }

    fn linear(x: f64)        -> f64 { x }
    fn threshold(x: f64)     -> f64 { if x > 0.0 { 1.0 } else { 0.0 } }
    fn sign(x: f64)          -> f64 { if x > 0.0 { 1.0 } else if x == 0.0 { 0.0 } else { -1.0 } }
    fn sigmoid(x: f64)       -> f64 { 1.0 / (1.0 + (-x).exp()) }
    fn tanh(x: f64)          -> f64 { x.tanh() }
    fn soft_sign(x: f64)     -> f64 { x / (1.0 + x.abs()) }
    fn bent_identity(x: f64) -> f64 { (((x.powi(2) + 1.0).sqrt() - 1.0) / 2.0) + x }
    fn relu(x: f64)          -> f64 { if x > 0.0 { x } else { 0.0 } }

// ...
```

By the way, is there some issue with representing `Activation` indices as `u8`? I see they're often `i32`.

Great PR, hopefully some of these perspectives are useful! :)

pengowen123 commented 2 years ago

* I would love to see the serialization work pushed onto serde (or something) for better correctness guarantees, using a macro/declarative serialization format (ideally with protocol versioning). Not critical now; it looks to be pretty good, it's just cumbersome and a little error-prone (and brittle) to be doing the string manipulation manually.

This is the current plan. It would be nice to retain something reasonably human-writable, but I think serde should work pretty well here.
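
Roughly what I have in mind, assuming serde with the derive feature (the type and field names here are illustrative, not the final format):

```rust
use serde::{Deserialize, Serialize};

/// Sketch only: derive the (de)serialization code instead of hand-rolling string
/// manipulation, and carry an explicit version field for protocol versioning.
#[derive(Serialize, Deserialize)]
struct SavedNetwork {
    version: u32,
    activation: String,
    genome: Vec<SavedGene>,
}

#[derive(Serialize, Deserialize)]
struct SavedGene {
    kind: String,
    id: usize,
    weight: f64,
}
```

Any human-readable serde backend could then be plugged in without touching these types, which should keep the format reasonably human-writable.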

* is `Network::validate()` used anywhere? It yields `Ok(())` deterministically.

This function was removed and folded into the `rebuild_network_metadata` function in a later commit.

* it'd be really cool if we could apply something along the lines of

  * `Network::save_to_file(&self, path: &str)` => `Network::to_file::<P: Into<Path>>(&self, path: P)`
  * `Network::load_from_file(&self, path: &str)` => `Network::from_file::<P: Into<Path>>(&self, path: P)`

I like this better too, I'll change it when serialization/deserialization is reworked.
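
Something like the sketch below, though `AsRef<Path>` is probably the bound to use rather than `Into<Path>`, since `Path` is unsized and `AsRef<Path>` is what the `std::fs` functions take (the struct and its string field are placeholders, not the real serialization logic):

```rust
use std::fs;
use std::io;
use std::path::Path;

// Placeholder standing in for the real `Network` type.
struct Network {
    text_repr: String,
}

impl Network {
    /// Write the serialized network to a file.
    fn to_file<P: AsRef<Path>>(&self, path: P) -> io::Result<()> {
        fs::write(path, &self.text_repr)
    }

    /// Read and deserialize a network from a file.
    fn from_file<P: AsRef<Path>>(path: P) -> io::Result<Self> {
        let text_repr = fs::read_to_string(path)?;
        Ok(Network { text_repr })
    }
}
```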

* I wonder if we can perform the same tasks as `rebuild_neuron_info` in O(1) each time updates are applied? Is the scope of possible changes to the hashmap knowable a priori? May not be worth it if it's not obvious with your domain knowledge.

The best we can do here is O(n) in the number of neurons in the genome, as the entire `neuron_info` map must be scanned and updated. It would still be much less work than a full rebuild though, so I'll do this where possible with the mutation operators.
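
For the insertion case, the patch-up pass could look something like this (the field names are hypothetical, not the actual `NeuronInfo` definition):

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Hypothetical shape of the per-neuron metadata.
struct NeuronInfo {
    subgenome_range: Range<usize>,
    depth: usize,
}

/// After inserting `inserted_len` genes at index `insert_at`, every stored range
/// can be patched in one O(n) pass instead of rebuilding the map from scratch.
fn shift_ranges(
    neuron_info: &mut HashMap<usize, NeuronInfo>,
    insert_at: usize,
    inserted_len: usize,
) {
    for info in neuron_info.values_mut() {
        // Starts at or after the insertion point: the start shifts right.
        if info.subgenome_range.start >= insert_at {
            info.subgenome_range.start += inserted_len;
        }
        // Ends after the insertion point: the end shifts right (this also covers
        // ranges containing the insertion point, which simply grow).
        if info.subgenome_range.end > insert_at {
            info.subgenome_range.end += inserted_len;
        }
    }
}
```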

* **Question**: Is the `NeuronInfo` struct applicable to deduplication? Can we rely on these things to build something like topological fingerprints, capturing depth and subgenome range? It's one of my long-term desires, although it is likely over-optimization at this stage.

That's a good idea! The `neuron_info` map does seem to capture all structural information except for connections, so it would be well suited to determining structural similarity in eant2. Checking for structural identity should be easy as well.
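
A first cut at a fingerprint could be as simple as hashing each neuron's (depth, subgenome length) pair (rough idea only; as noted, it ignores connection structure):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Rough idea: fold each neuron's (depth, subgenome length) pair, in genome
/// order, into a single u64 so structurally identical genomes compare equal
/// in O(1) after an O(n) pass.
fn structural_fingerprint(per_neuron: &[(usize, usize)]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for &(depth, subgenome_len) in per_neuron {
        (depth, subgenome_len).hash(&mut hasher);
    }
    hasher.finish()
}
```

Equal fingerprints would still need a full comparison to rule out hash collisions.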

* I think the function-pointer approach to activation functions is not the best we can do; it'd be easier to understand and probably more performant on modern CPUs to replace `value = activation.get_func()(sum_inputs)` with `value = activation.apply(sum_inputs)`. I propose a new implementation of `Activation` that is a little shorter and declares everything a bit more statically:
```rust
/// Represents which transfer function to use for evaluating neural networks.
#[derive(Debug, Copy, Clone, Eq, PartialEq)]
pub enum Activation {
    /// Maps input to output directly, as if there is no transfer function.
    Linear,
    /// Outputs 1 if input is greater than 0, 0 otherwise.
    Threshold,
    /// Outputs 1 if input is greater than 0, 0 if input is equal to 0, -1 otherwise. Useful
    /// for simple problems and boolean logic, as it only allows three possible output values.
    Sign,
    /// A non-linear function. This function is the most general, so it should be defaulted to.
    Sigmoid,
    /// Hyperbolic tangent; like `Sigmoid`, but outputs in the range (-1, 1).
    Tanh,
    /// A cheaper sigmoid-like function: x / (1 + |x|).
    SoftSign,
    /// A smooth, non-saturating function that grows approximately linearly.
    BentIdentity,
    /// Rectified linear unit: outputs x if x is greater than 0, 0 otherwise.
    Relu,
}

impl Activation {
    /// Apply this activation function
    pub fn apply(&self, x: f64) -> f64 {
        match self {
            Activation::Linear       => Self::linear(x),
            Activation::Threshold    => Self::threshold(x),
            Activation::Sign         => Self::sign(x),
            Activation::Sigmoid      => Self::sigmoid(x),
            Activation::Tanh         => Self::tanh(x),
            Activation::SoftSign     => Self::soft_sign(x),
            Activation::BentIdentity => Self::bent_identity(x),
            Activation::Relu         => Self::relu(x),
        }
    }

    fn linear(x: f64)        -> f64 { x }
    fn threshold(x: f64)     -> f64 { if x > 0.0 { 1.0 } else { 0.0 } }
    fn sign(x: f64)          -> f64 { if x > 0.0 { 1.0 } else if x == 0.0 { 0.0 } else { -1.0 } }
    fn sigmoid(x: f64)       -> f64 { 1.0 / (1.0 + (-x).exp()) }
    fn tanh(x: f64)          -> f64 { x.tanh() }
    fn soft_sign(x: f64)     -> f64 { x / (1.0 + x.abs()) }
    fn bent_identity(x: f64) -> f64 { (((x.powi(2) + 1.0).sqrt() - 1.0) / 2.0) + x }
    fn relu(x: f64)          -> f64 { if x > 0.0 { x } else { 0.0 } }

// ...
```

Makes sense to me.
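
For reference, a call site would then read like this (the wrapper function is just for illustration, using the `Activation` type proposed above):

```rust
// Statically dispatched match instead of an indirect call through a function pointer.
fn apply_activation(activation: Activation, sum_inputs: f64) -> f64 {
    // previously: activation.get_func()(sum_inputs)
    activation.apply(sum_inputs)
}
```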

By the way, is there some issue with representing `Activation` indices as `u8`? I see they're often `i32`.

`u8` would work as well, but I think the integer conversion will be unnecessary with serde anyway.
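
For example, with serde the variants would simply serialize by name (sketch only, assuming serde with the derive feature):

```rust
use serde::{Deserialize, Serialize};

/// Sketch: variants are (de)serialized by name, so no manual
/// integer <-> Activation conversion is needed at all.
#[derive(Debug, Copy, Clone, Eq, PartialEq, Serialize, Deserialize)]
pub enum Activation {
    Linear,
    Threshold,
    Sign,
    Sigmoid,
    Tanh,
    SoftSign,
    BentIdentity,
    Relu,
}

// If a compact numeric form is ever still needed, `u8` is plenty:
// let byte = Activation::Relu as u8;
```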

Great PR, hopefully some of these perspectives are useful! :)

I appreciate the feedback, thanks a lot! I'll go ahead and merge this then and implement the rest of the changes in future commits.