the-aerospace-corporation / brainblocks

Practical Tool for Building ML Applications with HTM-Like Algorithms
GNU Affero General Public License v3.0

Sequence learner Load error #8

Closed vaibhavch closed 3 years ago

vaibhavch commented 3 years ago
    # hgt is a HyperGridTransform created earlier in the script
    b0 = BlankBlock(num_s=hgt.num_bits)
    sl = SequenceLearner(num_spc=10, num_dps=10, num_rpd=10, d_thresh=6, perm_thr=1, perm_inc=1, perm_dec=0)

    sl.input.add_child(b0.output)
    # after some computations
    sl.save(file_str='mysequence.bin')

    sl.load(file_str='mysequence.bin')
    print(sl.get_historical_count())

When I load the saved file, an error happens:

Error: 0

vaibhavch commented 3 years ago

It seems there is an issue loading the next available coincidence detector:

    if (fread(sl->s_next_d, sl->num_s * sizeof(uint32_t), 1, fptr) == 0) {
        printf("Error:\n"); // TODO
    }

ddigiorg commented 3 years ago

I'm getting the same error. Looking into it now.

vaibhavch commented 3 years ago

@ddigiorg did you find the problem?

ddigiorg commented 3 years ago

Got it fixed and we're going through our process to update the repo. Stay tuned.

vaibhavch commented 3 years ago

@ddigiorg Good to hear. If the process takes time, can you show me a quick fix for now?

ddigiorg commented 3 years ago

In src/bbcore/sequence_learner.c please update the save and load functions:

// =============================================================================
// Save
// =============================================================================
void sequence_learner_save(struct SequenceLearner* sl, const char* file) {
    FILE *fptr;

    // check if file can be opened
    if ((fptr = fopen(file,"wb")) == NULL) {
       printf("Error in sequence_learner_save(): cannot open file");
       exit(1);
    }

    // check if block has been initialized
    if (sl->init_flag == 0) {
        printf("Error in sequence_learner_save(): block not initialized\n");
    }

    struct CoincidenceSet* cs;

    // save hidden coincidence detector receptor addresses and permanences
    for (uint32_t d = 0; d < sl->num_d; d++) {
        cs = &sl->d_hidden[d];
        fwrite(cs->addrs, cs->num_r * sizeof(cs->addrs[0]), 1, fptr);
        fwrite(cs->perms, cs->num_r * sizeof(cs->perms[0]), 1, fptr);
    }

    // save output coincidence detector receptor addresses and permanences
    for (uint32_t d = 0; d < sl->num_d; d++) {
        cs = &sl->d_output[d];
        fwrite(cs->addrs, cs->num_r * sizeof(cs->addrs[0]), 1, fptr);
        fwrite(cs->perms, cs->num_r * sizeof(cs->perms[0]), 1, fptr);        
    }

    // save next available coincidence detector on each statelet
    fwrite(sl->s_next_d, sl->num_s * sizeof(sl->s_next_d[0]), 1, fptr);

    fclose(fptr);
}

// =============================================================================
// Load
// =============================================================================
void sequence_learner_load(struct SequenceLearner* sl, const char* file) {
    FILE *fptr;

    // check if file can be opened
    if ((fptr = fopen(file,"rb")) == NULL) {
       printf("Error in sequence_learner_load(): cannot open file\n");
       exit(1);
    }

    // check if block has been initialized
    if (sl->init_flag == 0) {
        printf("Error in sequence_learner_load(): block not initialized\n");
    }

    struct CoincidenceSet* cs;

    // load hidden coincidence detector receptor addresses and permanences
    for (uint32_t d = 0; d < sl->num_d; d++) {
        cs = &sl->d_hidden[d];
        fread(cs->addrs, cs->num_r * sizeof(cs->addrs[0]), 1, fptr);
        fread(cs->perms, cs->num_r * sizeof(cs->perms[0]), 1, fptr);
    }

    // load output coincidence detector receptor addresses and permanences
    for (uint32_t d = 0; d < sl->num_d; d++) {
        cs = &sl->d_output[d];
        fread(cs->addrs, cs->num_r * sizeof(cs->addrs[0]), 1, fptr);
        fread(cs->perms, cs->num_r * sizeof(cs->perms[0]), 1, fptr);        
    }

    // load next available coincidence detector on each statelet
    fread(sl->s_next_d, sl->num_s * sizeof(sl->s_next_d[0]), 1, fptr);

    fclose(fptr); 
}

vaibhavch commented 3 years ago

Now I need to call sl.initialize() before sl.load() to make it work. It is weird that the anomaly score is different every time for the same sequence after a reset, even though learning is disabled with sl.compute(learn=False).
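A minimal sketch of the call order that works for now (reusing the filename from the earlier snippet):

    sl.initialize()                       # workaround: initialize before loading
    sl.load(file_str='mysequence.bin')
    sl.compute(learn=False)               # scores still differ from run to run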

ddigiorg commented 3 years ago

I will update the source code to automatically initialize the block when sl.load() is called. Can you share the code in your script? I'd like to help debug it or fix any BrainBlocks issues if necessary.

vaibhavch commented 3 years ago

train.py

import pandas as pd

# BlankBlock, SequenceLearner, and HyperGridTransform come from brainblocks (imports omitted here)
hgt = HyperGridTransform(num_grids=100, num_bins=32, num_subspace_dims=1)
b0 = BlankBlock(num_s=hgt.num_bits)
sl = SequenceLearner(num_spc=10, num_dps=10, num_rpd=10, d_thresh=6, perm_thr=1, perm_inc=1, perm_dec=0)
sl.input.add_child(b0.output)

df = prepare_data()              # this function returns a pandas DataFrame with 30 columns
idf = df.drop('signal', axis=1)  # 'signal' is the label column, so drop it from the input

for index in range(10, len(df) - 10):
    edf = pd.DataFrame()
    if int(df['signal'].iloc[index]) == 1:
        edf = edf.append(idf[index - 10:index])  # collect a 10-row sequence based on the signal column
    else:
        continue

    X = edf.to_numpy()
    hgt.fit(X)
    X_bits = hgt.transform(X)

    for k in range(len(X_bits)):
        X_array = X_bits[k, :].flatten()
        b0.output.bits = X_array
        sl.compute(learn=True)

    b0.clear()
    sl.clear()

sl.save(file_str='sl.bin')

test.py

hgt = HyperGridTransform(num_grids=100, num_bins=32, num_subspace_dims=1)
b0 = BlankBlock(num_s=hgt.num_bits)
sl = SequenceLearner(num_spc=10, num_dps=10, num_rpd=10, d_thresh=6, perm_thr=1, perm_inc=1, perm_dec=0)
sl.input.add_child(b0.output)
sl.initialize()
sl.load(file_str='sl.bin')

df = prepare_data()      # same prepare_data() as in train.py
edf = df[17970:17980]    # this is one of the sequences used in training
print(edf)
X = edf.to_numpy()
X_bits = hgt.transform(X)

for k in range(len(X_bits)):
    X_array = X_bits[k, :].flatten()
    b0.output.bits = X_array
    sl.compute(learn=False)
    if k == len(X_bits) - 1:
        score = sl.get_score()  # anomaly score on the last row
        print(score)

If we run test.py, the anomaly score is different every time: same sequence, different anomaly score on every run.

jacobeverist commented 3 years ago

At least one source of your difficulty is that the HyperGridTransform is newly created in each script. Currently we don't have a feature to save its configuration, so whenever you create one it gets a completely different randomly generated configuration.
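A possible user-side workaround, assuming the fitted HyperGridTransform object is picklable (an assumption, not a documented feature), is to persist it yourself and reload it in the test script:

    import pickle

    # train.py: persist the fitted transform (assumes the object pickles cleanly)
    with open('hgt.pkl', 'wb') as f:
        pickle.dump(hgt, f)

    # test.py: restore the same transform instead of constructing a new one
    with open('hgt.pkl', 'rb') as f:
        hgt = pickle.load(f)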

One thing you can do is pass in a seed for the random number generator. For the constructor, add the following keyword argument:

random_state=0

This ensures that the randomly created configuration is the same for both train and test scripts.

Also, make sure you "fit()" the HGT on the same data for each case. That way you know it is configured identically.
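
Putting both suggestions together, a minimal sketch (X_shared is a hypothetical placeholder for whatever data both scripts fit on, and the import path is assumed):

    import numpy as np
    from brainblocks.tools import HyperGridTransform  # import path assumed; adjust to your install

    # X_shared stands in for the data both train.py and test.py fit on
    X_shared = np.arange(300, dtype=float).reshape(10, 30)

    # identical constructor arguments plus a fixed seed in both scripts
    hgt = HyperGridTransform(num_grids=100, num_bins=32, num_subspace_dims=1, random_state=0)

    # fitting on identical data keeps the resulting configuration identical
    hgt.fit(X_shared)
    X_bits = hgt.transform(X_shared)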

jacobeverist commented 3 years ago

@vaibhavch Did making the recommended changes to the HGT result in more consistent performance?

vaibhavch commented 3 years ago

Thanks, passing the seed solved the issue.