OdysseasKr / neural-disaggregator

Code for NILM experiments using Neural Networks. Uses Keras/Tensorflow and the NILMTK.
MIT License

Training multiple buildings #9

Open anasvaf opened 5 years ago

anasvaf commented 5 years ago

Hello Odyssea,

I am trying to replicate the results based on the Kelly et al. paper. Using your code I changed the train building to the following:

train.set_window(start="13-4-2013", end="1-1-2014")
test.set_window(start="1-1-2014", end="30-3-2014")

train_elec = []
test_building = 5
sample_period = 6
meter_key = 'kettle'

# Lists for train_elec, train_meter and train_mains for multiple buildings
train_elec = [train.buildings[i].elec for i in range(1,5)]
train_meter = [train_elec[j].submeters()[meter_key] for j in range(len(train_elec))]
train_mains = [train_elec[k].mains() for k in range(len(train_elec))]

# Test only in one house
test_elec = test.buildings[test_building].elec
test_mains = test_elec.mains()
rnn = RNNLSTMDissaggregate()

and then I call train_across_buildings:

start = time.time()
print("========== TRAIN ============")
epochs = 0
for i in range(3):
    print("CHECKPOINT {}".format(epochs))
    rnn.train_across_buildings(train_mains, train_meter, epochs=5, sample_period=sample_period)
    epochs += 5
    rnn.export_model("UKDALE-RNN-h{}-{}-{}epochs.h5".format(train_building,
                                                        meter_key,
                                                        epochs))

I am creating lists for the 4 houses that I need during training, but I get the following error when I call train_across_buildings on the RNN:

[screenshot: test]

Could you give me some hint on how to change the code?

Best, Tasos

tisalvadores commented 5 years ago

Hi

I'm having the same error, so I went to the train_across_buildings function and played with it a bit. I found that the error pointed to the same line (117) no matter what was there: first when I forced self.mmax = x, and then even when I left a blank line there or put other lines of code there.

This led me to think that the pandas error is not related to the function, or to that part of the function, but for some reason the traceback always points there. I'm not certain of anything, though :(

Let me know if the same happens to you!

anasvaf commented 5 years ago

Hello Tomas,

I have also tried to play with train_across_buildings. Here is how I tried to "debug" the function:

        if self.mmax is None:
            print(mainchunks)
            for m in mainchunks:
                print(len(m))
                input("wait")

If you look at the Series that are created, you will notice that for house 3, if I am not mistaken, you get an empty dataframe, so the maximum value cannot be calculated. I believe that is the core of the error.

[screenshot: error]

Let me know if you get something similar.

anasvaf commented 5 years ago

To be more specific, here is what you get when you calculate m.max() inside the for loop:

[screenshot: error_2]

tisalvadores commented 5 years ago

Hmm, I'm using the REDD dataset, so I haven't checked, but maybe the third house of UK-DALE just doesn't have a kettle meter. Check it and let me know, because I used to have that error and now I don't, and I don't know why 😅.

Nonetheless, that's not my case (I don't have an empty dataframe) and I'm still having problems.

I changed the code as you suggested to

        if self.mmax is None:
            print(mainchunks)
            for m in mainchunks:
                print(len(m))
                input("wait")
            self.mmax = max([m.max() for m in mainchunks])

and training with three houses I got

23179
wait
31616
wait
56886
wait

For some reason my code now runs a bit past that: it entered the train_across_buildings_chunk method and died at random.shuffle(batch_indexes). I fixed that by changing the line above it from batch_indexes = range(min(num_of_batches)) to batch_indexes = list(range(min(num_of_batches))). But now the reshape X = np.reshape(mainpart, (batch_size, self.window_size, 1)) raises this error:

Traceback (most recent call last):
  File "redd-test.py", line 39, in <module>
    disaggregator.train_across_buildings(train_mains, train_meters, epochs=1, sample_period=sample_period)
  File "/Users/TSV/Desktop/Progra/IPre/Server/seq2seq/2buildings/shortseq2pointdisaggregator.py", line 151, in train_across_buildings
    self.train_across_buildings_chunk(mainchunks, meterchunks, epochs, batch_size)
  File "/Users/TSV/Desktop/Progra/IPre/Server/seq2seq/2buildings/shortseq2pointdisaggregator.py", line 212, in train_across_buildings_chunk
    X = np.reshape(mainpart, (batch_size, self.window_size, 1))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 279, in reshape
    return _wrapfunc(a, 'reshape', newshape, order=order)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 51, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
ValueError: cannot reshape array of size 2100 into shape (42,100,1)
Closing remaining open files:/Users/TSV/Desktop/Progra/IPre/data/REDD/redd.h5...done/Users/TSV/Desktop/Progra/IPre/data/REDD/redd.h5...done

Let me know if you get here or if you don't get rid of the pandas error.
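
A minimal sketch of the two points in this comment, with made-up batch counts: in Python 3, random.shuffle needs a mutable sequence, which is why wrapping the range in list() helps, and a reshape to (42, 100, 1) needs 42 * 100 = 4200 values while the array in the traceback holds 2100. The values of num_of_batches are illustrative assumptions, not taken from the repository, and the sketch does not explain why the slice came out half the expected size.

```python
import random

import numpy as np

# Hypothetical per-building batch counts (not from the thread)
num_of_batches = [7, 5, 9]

# random.shuffle reorders in place, so it needs a mutable sequence;
# a bare range object cannot be modified in Python 3.
batch_indexes = list(range(min(num_of_batches)))
random.shuffle(batch_indexes)

# Shape from the traceback: (batch_size, window_size, 1) = (42, 100, 1)
batch_size, window_size = 42, 100
required = batch_size * window_size * 1  # number of values the shape needs

mainpart = np.zeros(2100)  # same size as the array in the traceback
try:
    np.reshape(mainpart, (batch_size, window_size, 1))
    reshaped_ok = True
except ValueError:
    # The element counts differ, so numpy refuses to reshape
    reshaped_ok = False

print(required, mainpart.size, reshaped_ok)
```

Printing len(mainpart) next to batch_size * self.window_size just before the reshape would show which slice comes up short.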

OdysseasKr commented 5 years ago

Hello everybody, and sorry for the late reply. Which dataset are you using? For UKDALE building 3, there is no data in the specified date range:

train.set_window(start="13-4-2013", end="1-1-2014")

This may explain the empty dataframe.
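
A minimal sketch of guarding against this, assuming the per-building chunks are pandas Series as in the debug output earlier in the thread. The helper name max_over_nonempty and the sample values are hypothetical; this is one way to make the failure explicit, not the repository's own code.

```python
import pandas as pd


def max_over_nonempty(chunks):
    """Return the largest value across chunks, ignoring empty ones."""
    nonempty = [c for c in chunks if len(c) > 0]
    if not nonempty:
        raise ValueError("no data in any building for the selected window")
    return max(float(c.max()) for c in nonempty)


# Hypothetical stand-ins for the per-building mains chunks
chunks = [
    pd.Series([120.0, 340.0, 95.0]),
    pd.Series([], dtype="float64"),  # a building with no data in the window
    pd.Series([210.0, 2980.0]),
]

print(max_over_nonempty(chunks))
```

Dropping an empty mains chunk would also mean dropping the matching meter chunk, so that the two lists stay aligned.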

tisalvadores commented 5 years ago

Hi Ody! First of all, thanks a lot for sharing your work and maintaining support! It's been really helpful 😃 I'm using REDD and don't have the empty dataframe problem, but another error, as I explained above. Does the train_across_buildings method, as it is in the repo, work fine for you?

anasvaf commented 5 years ago

Hello guys, I managed to fix the error by simply removing the train and test.set_window. I assume that the toolkit can identify the common dates within all the buildings. I am using the UK-DALE dataset for my experiments.

I print the batch 0 array for my 4 training houses and it is the following: Batch 0 of [1185228, 317437, 70848, 366821]

The error that I get now is that my data has to be 1-D.

[screenshot: error]

Any hints for this one?

anasvaf commented 5 years ago

I managed to fix it by changing the for loop inside the train_across_buildings_chunk as follows:

                # Create a batch out of data from all buildings
                for i in range(num_meters):
                    mainpart = mainchunks[i]
                    meterpart = meterchunks[i]
                    mainpart = mainpart[b*batch_size:(b+1)*batch_size]
                    meterpart = meterpart[b*batch_size:(b+1)*batch_size]
                    X = np.reshape(mainpart.values, (batch_size, 1, 1))
                    Y = np.reshape(meterpart.values, (batch_size, 1))

                    X_batch[i*batch_size:(i+1)*batch_size] = np.array(X)
                    Y_batch[i*batch_size:(i+1)*batch_size] = np.array(Y)

We needed only the values from the pandas Series in order to reshape.

Tomas, if you just change X = np.reshape(mainpart, (batch_size, self.window_size, 1)) to X = np.reshape(mainpart.values, (batch_size, self.window_size, 1)), the code should work.

Now the script is iterating over the batches. I will let you know regarding the progress.
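
A small sketch of the .values change, using a made-up Series in place of the real mains data: passing the underlying NumPy array to np.reshape gives a plain ndarray of the requested shape. The batch size of 4 is an illustrative assumption.

```python
import numpy as np
import pandas as pd

batch_size = 4
# Hypothetical stand-in for one building's slice of mains readings
mainpart = pd.Series(np.arange(batch_size, dtype="float64"))
meterpart = pd.Series(np.arange(batch_size, dtype="float64") * 0.5)

# Reshape the underlying arrays rather than the Series objects
X = np.reshape(mainpart.values, (batch_size, 1, 1))
Y = np.reshape(meterpart.values, (batch_size, 1))

print(type(X).__name__, X.shape, Y.shape)
```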

OdysseasKr commented 5 years ago

> I managed to fix the error by simply removing the train and test.set_window. I assume that the toolkit can identify the common dates within all the buildings. I am using the UK-DALE dataset for my experiments.

I am not sure whether the toolkit detects common sections across buildings. Also, by removing the limit, you are now getting all of the data available for each building.
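
If a shared window is wanted, it can be computed by hand as the intersection of each building's coverage. The sketch below uses plain pandas timestamps; the building labels and dates are made up for illustration and are not the real UK-DALE coverage, and common_window is a hypothetical helper, not part of NILMTK.

```python
import pandas as pd


def common_window(coverage):
    """Return the (start, end) overlap of all intervals, or None if there is none."""
    start = max(s for s, _ in coverage.values())
    end = min(e for _, e in coverage.values())
    return (start, end) if start < end else None


# Made-up (start, end) coverage per building, for illustration only
coverage = {
    "A": (pd.Timestamp("2013-01-01"), pd.Timestamp("2014-06-01")),
    "B": (pd.Timestamp("2013-04-13"), pd.Timestamp("2014-01-01")),
    "C": (pd.Timestamp("2013-09-01"), pd.Timestamp("2014-03-30")),
}

print(common_window(coverage))
```

A result of None means at least one building has no data overlapping the others, which is the situation that produces an empty chunk.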

OdysseasKr commented 5 years ago

@TomasSalvadores Your problem seems different from the one mentioned by the OP. Please open a new issue and describe it there so that we can discuss it.

anasvaf commented 5 years ago

> > I managed to fix the error by simply removing the train and test.set_window. I assume that the toolkit can identify the common dates within all the buildings. I am using the UK-DALE dataset for my experiments.

> I am not sure whether the toolkit detects common sections within building. Also by removing the limit, you are now getting all of the data available for each building.

You are right. I am just using all of the data available for each building. Sorry for any confusion :)

maechler commented 5 years ago

@anasvaf Thanks, you saved my day! I think there is an error in your code, though: I had to change Y = np.reshape(meterpart.values, (batch_size, 1)) to
Y = np.reshape(meterpart.values, (batch_size, 1, 1)) in order to make it work.
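
Which of the two target shapes works presumably depends on the output shape of the network being trained. A hedged sketch: reshape_target is a hypothetical helper, and output_shape is assumed to be a tuple with a leading batch dimension of None, like the one Keras reports as model.output_shape.

```python
import numpy as np


def reshape_target(values, output_shape):
    """Reshape a flat array to match an output shape such as (None, 1)."""
    per_sample = tuple(output_shape[1:])  # drop the batch dimension
    return np.reshape(values, (-1,) + per_sample)


meter_values = np.arange(8, dtype="float64")  # hypothetical batch of 8 targets

y_2d = reshape_target(meter_values, (None, 1))     # one value per sample
y_3d = reshape_target(meter_values, (None, 1, 1))  # one timestep, one feature

print(y_2d.shape, y_3d.shape)
```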

bundit786 commented 4 years ago

Hi @OdysseasKr, @anasvaf, @maechler, @TomasSalvadores, I have experimented with dae.train_across_buildings and got an error as well. I tested with the UK-DALE dataset and tried to learn a fridge model from House 1 and House 2. I tried the fixes recommended by both @anasvaf and @maechler, but they still didn't work. Below is the error (the same for both methods).

========== TRAIN ============
CHECKPOINT 0
0
Batch 0 of [25, 25]

AttributeError                            Traceback (most recent call last)
~\Anaconda3\envs\nilmtk-env\lib\site-packages\numpy\core\fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
     55     try:
---> 56         return getattr(obj, method)(*args, **kwds)
     57

~\Anaconda3\envs\nilmtk-env\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5066                 return self[name]
-> 5067             return object.__getattribute__(self, name)
   5068

AttributeError: 'Series' object has no attribute 'reshape'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
in
     39     print("CHECKPOINT {}".format(epochs))
     40
---> 41     dae.train_across_buildings(train_mains, train_meter, epochs=5, sample_period=sample_period)
     42
     43     epochs += 5

~\daedisaggregator.py in train_across_buildings(self, mainlist, meterlist, epochs, batch_size, **load_kwargs)
    140             meterchunks = [self._normalize(m, self.mmax) for m in meterchunks]
    141
--> 142             self.train_across_buildings_chunk(mainchunks, meterchunks, epochs, batch_size)
    143             try:
    144                 for i in range(num_meters):

~\daedisaggregator.py in train_across_buildings_chunk(self, mainchunks, meterchunks, epochs, batch_size)
    181                     mainpart = mainpart[b*batch_size:(b+1)*batch_size]
    182                     meterpart = meterpart[b*batch_size:(b+1)*batch_size]
--> 183                     X = np.reshape(mainpart.values, (batch_size, self.window_size, 1))
    184                     Y = np.reshape(meterpart.values, (batch_size, 1))
    185

~\Anaconda3\envs\nilmtk-env\lib\site-packages\numpy\core\fromnumeric.py in reshape(a, newshape, order)
    290            [5, 6]])
    291     """
--> 292     return _wrapfunc(a, 'reshape', newshape, order=order)
    293
    294

~\Anaconda3\envs\nilmtk-env\lib\site-packages\numpy\core\fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
     64         # a downstream library like 'pandas'.
     65     except (AttributeError, TypeError):
---> 66         return _wrapit(obj, method, *args, **kwds)
     67
     68

~\Anaconda3\envs\nilmtk-env\lib\site-packages\numpy\core\fromnumeric.py in _wrapit(obj, method, *args, **kwds)
     48         if not isinstance(result, mu.ndarray):
     49             result = asarray(result)
---> 50         result = wrap(result)
     51     return result
     52

~\Anaconda3\envs\nilmtk-env\lib\site-packages\pandas\core\series.py in __array_wrap__(self, result, context)
    733         """
    734         return self._constructor(result, index=self.index,
--> 735                                  copy=False).__finalize__(self)
    736
    737     def __array_prepare__(self, result, context=None):

~\Anaconda3\envs\nilmtk-env\lib\site-packages\pandas\core\series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    247                             'Length of passed values is {val}, '
    248                             'index implies {ind}'
--> 249                             .format(val=len(data), ind=len(index)))
    250                 except TypeError:
    251                     pass

ValueError: Length of passed values is 64, index implies 32768

Could you please suggest how to fix the problem?

Best, Bundit

pa8anas commented 1 year ago

> Hi Ody! First of all thanks a lot for sharing your work and maintaining support! It's been really helpful 😃 I'm using REDD and don't have the empty dataframe problem, but another error as i explained above. Does the train_across_buildings method as it is in the repo work fine for you?

Where can I download the REDD dataset?