the-aerospace-corporation / brainblocks

Practical Tool for Building ML Applications with HTM-Like Algorithms
GNU Affero General Public License v3.0

reset functionality #4

Closed dee-corr closed 3 years ago

dee-corr commented 3 years ago

There is no reset to signal the start of a new sequence.

Workaround suggested by @jacobeverist, "add a reset code, kind of like a newline or EOF character. This would indicate the end and start of a new signal and would nicely break up your sequences and prevent them from stitching together".

dee-corr commented 3 years ago

Hi @jacobeverist , I assumed you meant append a line of newline chars to my data input to break up the sequences, but I received a "ValueError: Input contains NaN, infinity or a value too large for dtype('float64')" from line "hgt.fit(X)". Have tried other non numeric chars to no avail. Was I meant to add this reset code further down somehow?
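For context, the error above is consistent with how `np.genfromtxt` treats fields it cannot parse as numbers: with the default float dtype they come back as NaN, and the later `fit` call rejects them. A minimal sketch, using made-up in-memory CSV text rather than the files from this issue:

```python
import io
import numpy as np

# Hypothetical CSV content: the middle row holds non-numeric "reset" characters.
csv_text = "1.0,2.0\na,a\n3.0,4.0\n"

# Fields that cannot be converted to float are returned as NaN.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# Flag every row that contains at least one NaN.
bad_rows = np.isnan(data).any(axis=1)

# Keep only the fully numeric rows.
clean = data[~bad_rows]
```

Checking `bad_rows` before calling `hgt.fit(X)` shows which input lines would trigger the ValueError.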

jacobeverist commented 3 years ago

@dee-corr I actually didn't mean the literal newline char, but something that could fill that role like an 'a', 0 or -1. Something that is processed by the encoder you are using that will create a unique encoding not created by any of the other data you have.
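One way to read this suggestion is to insert a numeric sentinel row between files, as in the sketch below. The helper name `join_with_reset_rows` and the sentinel value `-1000.0` are hypothetical. Whether a given encoder (including HyperGridTransform) maps such a value to an encoding that no real sample produces is an assumption that would need checking.

```python
import numpy as np

def join_with_reset_rows(segments, reset_value):
    """Stack 2-D arrays vertically with one constant 'reset' row between them."""
    num_cols = segments[0].shape[1]
    # One row filled with the sentinel value (hypothetical reset code).
    reset_row = np.full((1, num_cols), reset_value, dtype=float)
    pieces = []
    for i, seg in enumerate(segments):
        if i > 0:
            pieces.append(reset_row)  # marks the boundary between two files
        pieces.append(np.asarray(seg, dtype=float))
    return np.vstack(pieces)

# Two tiny stand-in "files" with two channels each.
file_a = np.array([[67.643, 64.689], [65.629, 64.689]])
file_b = np.array([[48.643, 74.693]])

X = join_with_reset_rows([file_a, file_b], reset_value=-1000.0)

# Boolean mask of the rows that are reset markers rather than data.
is_reset = np.all(X == -1000.0, axis=1)
```

The sentinel should lie well outside the range of the real signal so that it cannot be confused with a data sample.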

Can you post an example of your data and show where you want the reset to occur? Also, post the code that you are using to process it, and I'll show you what you need to do.

dee-corr commented 3 years ago

Hi @jacobeverist , thanks for the quick reply!

My data looks like this (16 channels per row):

 67.643,64.689,52.537,41.459,-1.376,-71.94,-65.763,24.271,39.646,26.218,-9.769,-70.06,-34.073,5.405,-25.01,-47.837
 65.629,64.689,53.544,48.441,-14.402,-75.901,-62.809,27.292,34.678,21.183,-1.779,-67.039,-33.066,7.419,-20.041,-38.84
 64.689,66.703,48.576,54.484,-19.37,-80.937,-55.76,20.243,25.614,14.2,5.136,-64.085,-30.112,12.455,-22.995,-36.826
 57.639,67.71,49.516,60.459,-23.398,-84.898,-51.798,14.267,17.624,8.225,17.154,-60.056,-33.066,19.437,-24.002,-35.819
 48.643,74.693,46.562,65.428,-19.37,-82.951,-50.791,10.306,10.642,11.179,25.144,-63.078,-43.07,20.444,-17.02,-31.791
 ......... ......... .........

I want to add a reset between files. Here's what I have so far (with just a really small amount of data to get started):

 from brainblocks.blocks import BlankBlock, SequenceLearner
 from brainblocks.tools import HyperGridTransform
 from brainblocks.datasets.time_series import make_sample_times, generate_multi_square, generate_sine
 import numpy as np
 import pandas as pd
 import matplotlib.pyplot as plt

 output_name = "multivariate_output"

 num_params = 16

 secs = 600
 sample_rate = 400

 param_values = []
 param_names = []

 param_dict = {}

 neg_set = np.genfromtxt("./data/testing/Pat1Train_1_0_sample.csv", delimiter=',')
 neg_set = np.concatenate((neg_set, np.genfromtxt("./data/testing/Pat1Train_2_0_sample.csv", delimiter=',')))
 pos_set = np.genfromtxt("./data/testing/Pat1Train_3_1_sample.csv", delimiter=',')

 for k in range(num_params):  # iterate over all 16 channels

     values0 = neg_set[:,k].tolist()
     values1 = pos_set[:,k].tolist()

     values = np.concatenate((values0, values1))

     param_dict["value_%d" % k] = values

 sigDF = pd.DataFrame(data=param_dict)
 X = sigDF.to_numpy()

 hgt = HyperGridTransform(num_grids=8, num_bins=8, num_subspace_dims=1)

 hgt.fit(X)
 X_bits = hgt.transform(X)

 b0 = BlankBlock(num_s=hgt.num_bits)

 sl = SequenceLearner(num_spc=10, num_dps=10, num_rpd=12, d_thresh=6)

 sl.input.add_child(b0.output)

 scores = []
 for k in range(len(X_bits)):
     X_array = X_bits[k, :].flatten()

     b0.output.bits = X_array

     # learn the sequence
     #if k < 13000:
     sl.compute(learn=True)
     #else:
     #sl.compute(learn=False)

     score = sl.get_score()

     scores.append(score)

 sigDF["score"] = scores

 print("Saving to " + output_name + ".csv")
 sigDF.to_csv(output_name + ".csv", index=True)

 print("Saving to " + output_name + ".png")
 axes = sigDF.plot(subplots=True, legend=False)
 for k in range(len(sigDF.columns)):
     axes[k].set_ylabel(sigDF.columns[k])
 plt.savefig(output_name + ".png")
 plt.close()

jacobeverist commented 3 years ago

Try this. This just shows you how to create the blocks, connect them together, and set their data. You will have to organize the code yourself.


# this is where your HGT data goes
b0 = BlankBlock(num_s=hgt.num_bits)

# this is for your reset signal
# it has the same size as the number of active bits from the HGT
# this ensures the sequence learner receives the same number of active bits while it is being reset
b1 = BlankBlock(num_s=hgt.num_act_bits)

# sequence learner block
sl = SequenceLearner(num_spc=10, num_dps=10, num_rpd=12, d_thresh=6)

# accept your transformed data
sl.input.add_child(b0.output)

# also accept the reset signal
sl.input.add_child(b1.output)

# create the static signals for the reset block and input block
# (builtin bool is used because np.bool is unavailable in some NumPy releases)
off_reset_signal = np.zeros(hgt.num_act_bits, dtype=bool)
on_reset_signal = np.ones(hgt.num_act_bits, dtype=bool)
clear_input_signal = np.zeros(hgt.num_bits, dtype=bool)

# INGEST CODE HERE
# LOOP THROUGH DATA TIME STEPS
# TRANSFORM DATA WITH HGT

# when adding normal data in a sequence, set the blank blocks as follows:
if isInSequence:
  b0.output.bits = X_array
  b1.output.bits = off_reset_signal

# when terminating a sequence and starting a new one, set the blank blocks as follows:
else:
  b0.output.bits = clear_input_signal
  b1.output.bits = on_reset_signal

# execute the sequence learner as normal
sl.compute(learn=True)

# get abnormality score
score = sl.get_score()
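
The snippet above leaves the ingest loop and the `isInSequence` flag to the reader. A minimal sketch of one way to drive them from per-file row counts is shown here. The helper `reset_schedule`, the stand-in sizes, and the toy `X_bits` array are all hypothetical, and the brainblocks calls appear only as comments so the sketch runs with NumPy alone.

```python
import numpy as np

def reset_schedule(segment_lengths):
    """Return (row_index, in_sequence) steps with one reset step between segments."""
    steps = []
    start = 0
    for i, n in enumerate(segment_lengths):
        if i > 0:
            steps.append((None, False))  # reset step between two files
        steps.extend((start + j, True) for j in range(n))
        start += n
    return steps

# Stand-ins for hgt.num_bits and hgt.num_act_bits (assumed values).
num_bits, num_act_bits = 8, 2

# Toy encoded data: 5 time steps, as if two files of 3 and 2 rows were concatenated.
X_bits = (np.arange(5 * num_bits).reshape(5, num_bits) % 3) == 0

off_reset_signal = np.zeros(num_act_bits, dtype=bool)
on_reset_signal = np.ones(num_act_bits, dtype=bool)
clear_input_signal = np.zeros(num_bits, dtype=bool)

fed = []  # records what would be written to the two blank blocks at each step
for row, in_sequence in reset_schedule([3, 2]):
    if in_sequence:
        inp, rst = X_bits[row], off_reset_signal
    else:
        inp, rst = clear_input_signal, on_reset_signal
    # With brainblocks, the block updates from the snippet above would go here:
    #   b0.output.bits = inp
    #   b1.output.bits = rst
    #   sl.compute(learn=True)
    fed.append((inp.copy(), rst.copy()))
```

Because the reset steps do not correspond to data rows, any scores collected on those steps would need to be skipped before attaching a score column to the original DataFrame.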

dee-corr commented 3 years ago

Thank you!! I'll let you know how it goes

jacobeverist commented 3 years ago

Fixed with https://github.com/the-aerospace-corporation/brainblocks/commit/627b8788b282bef59010cb9625c008c8349a9aff

The .clear() method is demonstrated in this example script: https://github.com/the-aerospace-corporation/brainblocks/blob/master/examples/python/abnormality_detection/abnorm_blocks_reset_sequence.py