cderinbogaz / inpredo

Inpredo is a Deep Learning tool which looks into financial charts and predicts stock movements.
https://towardsdatascience.com/making-a-i-that-looks-into-trade-charts-62e7d51edcba
MIT License
159 stars · 93 forks

KeyError: 'val_acc' after first epoch - OSX #4

Closed wilmorrtr closed 4 years ago

wilmorrtr commented 4 years ago

Solutions in this thread:

- The KeyError in the title
- Using a specific model from a checkpoint, not the initial base model (important)
- Using the prediction function
- Pointing the prediction batch mode to test validation data
- Copying random files to validation


Original post:

The training script is crashing out early on with the following output:

```
Found 7372 images belonging to 2 classes.
Found 0 images belonging to 0 classes.
Epoch 1/1000
103/103 [==============================] - 266s 3s/step - loss: 210.3966 - accuracy: 0.7679

Traceback (most recent call last):
  File "train-binary.py", line 118, in <module>
    validation_steps=nb_validation_samples//batch_size)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/keras/engine/training.py", line 1732, in fit_generator
    initial_epoch=initial_epoch)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/keras/engine/training_generator.py", line 260, in fit_generator
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/keras/callbacks/callbacks.py", line 152, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/keras/callbacks/callbacks.py", line 702, in on_epoch_end
    filepath = self.filepath.format(epoch=epoch + 1, **logs)
KeyError: 'val_acc'
```

I'm not familiar enough with Python to debug this properly and would appreciate any suggestions you might have.

thanks!

lmcasam commented 4 years ago

I fixed this by putting in "val_accuracy" instead of just val_acc

wilmorrtr commented 4 years ago

I fixed this by putting in "val_accuracy" instead of just val_acc

I’m afraid that just changed the error message to:

KeyError: 'val_accuracy'

Still looking into it. I’ll update if I solve it!

lmcasam commented 4 years ago

There are 2 spots with val_acc.

Also make sure you have Pillow installed. I didn't; maybe that helped too.
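Concretely, the two spots are usually the checkpoint filename pattern and the callback's `monitor` argument. A small sketch (values illustrative, not from the repo) of how Keras uses the first one:

```python
# Spot 1: the metric name embedded in the checkpoint filename pattern.
# Keras fills it in from the epoch's logs, roughly: filepath.format(epoch=epoch + 1, **logs)
filepath = "./models/weights-improvement-{epoch:02d}-{val_accuracy:.2f}.hdf5"
logs = {"loss": 0.41, "accuracy": 0.86, "val_loss": 0.52, "val_accuracy": 0.81}
print(filepath.format(epoch=1, **logs))  # ./models/weights-improvement-01-0.81.hdf5

# Spot 2: the metric the callback monitors, e.g.
# ModelCheckpoint(filepath, monitor='val_accuracy', save_best_only=True, mode='max')
```

Both spots must name a metric that actually appears in the logs, or the `.format()` call raises the KeyError seen above.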



wilmorrtr commented 4 years ago

I did change it in both locations: the path for the target dir and in the ModelCheckpoint.

I suspect it's time to learn Python. I just wish all this fun ML stuff was more C++ focused :P 20 years of development experience and Python is a closed book to me.

wilmorrtr commented 4 years ago

I solved this (potentially - I am still figuring out what does what) by changing all instances of val_acc to accuracy based on it being a tracked metric. I might be wildly wrong - we’ll see after it’s trained up!
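This lines up with the "Found 0 images belonging to 0 classes" line in the original output: with an empty validation generator, no val_* metrics are ever logged, so any val_* key fails, while tracked training metrics work. A minimal reproduction of the mechanism, with made-up numbers:

```python
# With no validation data, end-of-epoch logs hold only training metrics,
# so any checkpoint filename referencing a val_* key raises KeyError.
logs = {"loss": 210.3966, "accuracy": 0.7679}   # no val_* keys logged
pattern = "weights-{epoch:02d}-{val_accuracy:.2f}.hdf5"
try:
    pattern.format(epoch=1, **logs)
except KeyError as err:
    print("KeyError:", err)                     # KeyError: 'val_accuracy'

# Formatting on a metric that is actually tracked works fine:
print("weights-{epoch:02d}-{accuracy:.2f}.hdf5".format(epoch=1, **logs))  # weights-01-0.77.hdf5
```

(The more thorough fix would be to make sure the validation directory actually contains class subfolders with images, so val_* metrics exist at all.)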

lmcasam commented 4 years ago

Great. Let me know if you can get the predict-binary script working.
I am stuck and don’t know how to use it.


wilmorrtr commented 4 years ago

Will do! I have a pretty good idea of how to move forward. I'll pop back once training is done. I haven't set up my CUDA box yet, so I'm still running TensorFlow on the CPU, but I'll be moving this to proper hardware in the next couple of days.

lmcasam commented 4 years ago

I did 10 epochs of training .. took about 25 min on a MacBook Pro laptop.

wilmorrtr commented 4 years ago

Each epoch is taking 361 seconds on my MBP. The roughly 9000 ways all these components can be set up means I probably have more than a few badly optimized bits. I’ll be setting up a ML tuned Ubuntu distro on my CUDA machine. It currently does Blender rendering :P

lmcasam commented 4 years ago

Nice. But for the dev portion (you can pass the -d option during training) it will only do 10 epochs, so it should not take that long. I ended up with 98% accuracy anyway; I don't think you need to run the full 1000 epochs to see if it all works.

wilmorrtr commented 4 years ago

Okies fwiw I can help you with predict-binary.py

Copy a random (preferably unseen) image to a temp dir (like /tmp/sell.jpg) and add this line to the end of the script:

```python
predict('/tmp/sell.jpg')
```

Be sure to try it with both a buy and a sell. My first attempt yesterday seemed to work because it identified a sell as a sell, but it also identified a BUY as a sell, with a confidence of 0.984. Then I used the predict function that takes a directory as input (no trailing slash, no buy/sell subdirectories, just jpgs in a dir) and it called everything a sell.
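For what it's worth, the buy/sell/no-action decision on a single sigmoid output can be sketched as a plain confidence threshold. The function, threshold, and labels here are hypothetical, not taken from predict-binary.py:

```python
# Hypothetical post-processing: map a predicted buy probability to a trade label.
# The 0.9 threshold is illustrative, not from the repo's scripts.
def classify(p_buy, threshold=0.9):
    if p_buy >= threshold:        # confident buy
        return "buy"
    if p_buy <= 1 - threshold:    # confident sell
        return "sell"
    return "no_action"            # not confident either way

print(classify(0.984))  # buy
print(classify(0.02))   # sell
print(classify(0.6))    # no_action
```

A model that calls everything a sell with high confidence would always land in the "sell" branch here, which is why testing with both classes matters.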

BUT. In between training and prediction, I'd had to recompile TensorFlow with different CPU flags, so I figured I'd need to train again. I did 10 epochs and it had a 50% real success rate.

Sorry for the delay getting back to you. It was one of those things where I wrote it all out but wanted to wait to submit until I had a solid instance running. I'm retraining it now, but I won't make you wait :)

lmcasam commented 4 years ago

Thanks for getting back. I was able to figure it out in the end.
Also, what I did was run the following to backtest all the validation data against the trained model:

```python
predict_files("../data/validation")
```

I got the following precision after running:

```
True buy: 4675
True sell: 4669
False buy: 89
False sell: 73
no action: 52
Sell Precision: 0.9846056516237874
Buy Precision: 0.9813182199832073
Recall: 0.9846251053074979
Precision: 0.9829581317062908
F-measure: 0.9846153783695529
```
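As a sanity check, the per-class precision figures can be recomputed from the confusion counts above with the standard definitions (the script's aggregate Recall/Precision/F-measure may be combined differently, so only the per-class numbers are reproduced here):

```python
# Recompute buy/sell precision from the reported confusion counts.
true_buy, true_sell = 4675, 4669
false_buy, false_sell = 89, 73

buy_precision = true_buy / (true_buy + false_buy)     # buys called buy / all called buy
sell_precision = true_sell / (true_sell + false_sell)  # sells called sell / all called sell
print(f"Buy precision:  {buy_precision:.4f}")   # 0.9813
print(f"Sell precision: {sell_precision:.4f}")  # 0.9846
```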

During training I achieved 98% accuracy, which is very good!

I also tried, as you suggested, putting random pictures into the src directory and running predict, and that works as well.

lmcasam commented 4 years ago

I forgot to tell you something I figured out that isn't written down anywhere!

After you run the train-binary script it will create a base model and then checkpoint models as training progresses. However, the predict script only loads the base model.h5, which is not trained at all. I therefore manually copied one of the generated checkpoint models, the one with the highest accuracy, over model.h5 so that the predict script actually uses the trained weights!
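That manual copy could be automated along these lines. This is a hedged sketch: it assumes checkpoint filenames of the form weights-improvement-07-0.98.hdf5 with the accuracy embedded at the end, and the paths are placeholders:

```python
# Hypothetical helper: copy the checkpoint with the highest accuracy in its
# filename over model.h5, so the predict script loads trained weights.
import glob
import re
import shutil

def promote_best_checkpoint(models_dir="./models", target="./models/model.h5"):
    best_path, best_acc = None, -1.0
    for path in glob.glob(f"{models_dir}/weights-improvement-*.hdf5"):
        m = re.search(r"-(\d+\.\d+)\.hdf5$", path)  # pull the accuracy out of the name
        if m and float(m.group(1)) > best_acc:
            best_acc, best_path = float(m.group(1)), path
    if best_path:
        shutil.copy(best_path, target)
    return best_path
```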

wilmorrtr commented 4 years ago

I’ve been an SVN guy forever and don’t really use GitHub much. Is there a way to send private messages? I wanted to talk about how we’re going to generate the ongoing samples, and a queueing system using Redis, but I don’t want to take this thread off course.

You’ve helped a lot! Thanks so much for the details!

wilmorrtr commented 4 years ago

Another bit from me:

I wanted the validation data to be as random as possible, so I wrote this up to move random files:

```shell
ls | gshuf -n3106 | xargs -I _ mv _ ../../validation
```

I got 3106 by just doing the 20% math elsewhere and sticking it in; that wasn't the annoying part.

You might need to install gshuf, which is the OS X version of plain old shuf.
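A cross-platform alternative is to do the same move in Python. This is a sketch with placeholder directory names and an assumed 20% split:

```python
# Move a random fraction of the files in src_dir into val_dir,
# the Python equivalent of the ls | gshuf | xargs mv one-liner above.
import os
import random
import shutil

def split_validation(src_dir, val_dir, fraction=0.2, seed=None):
    random.seed(seed)
    files = [f for f in os.listdir(src_dir)
             if os.path.isfile(os.path.join(src_dir, f))]
    for name in random.sample(files, int(len(files) * fraction)):
        shutil.move(os.path.join(src_dir, name), os.path.join(val_dir, name))
```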

lmcasam commented 4 years ago

Nice, I should try that. But keep in mind what we are doing: reading images and determining whether they fit as a buy or a sell. We already trained it on what each pic is, so we are just doing picture recognition. So randomizing the pictures in the validation set should not make a difference and should give the same % accuracy results. (BTW I used PNG for my pics since it takes less space.)

The more interesting thing would be to make Graphwerk more predictive by adding more charts or metrics while it's creating the graphs.

I checked and there is no way to send private msg in Github. That feature was removed.

wilmorrtr commented 4 years ago

I didn’t pay any attention to how the graph was created, so I was worried there would be continuing trends between sequential images and just figured better safe than sorry.

In any case, I’ll happily share the couple ideas I’ve come up with.

Change the color of the bar from red to green based on the RSI (or another indicator) and its performance at the time.

Have a circle / dot on the image representing trading volume, and another for bid/ask spread.

I think that might make a reasonably significant difference.

If you do this based on me saying it, please do share some tips back. I’ve been programming in C and C++ for 20 years, but for some reason Python is giving me heartburn. :P

lmcasam commented 4 years ago

:) Some good ideas! I will need to look at some more techniques and metrics to understand the best way. For example, this is a 12-hour rolling window that it predicts on. But what happens if there are 2 buys in a row? Do you buy more, or stay put? What happens if there is a buy and then a sell? Do you get out of your position and hold nothing, or do you sell double to compensate for the buy you already have in the market?

There are a lot of actual "trading" questions that need to be determined aside from the programming.

I am new to Python as well. I have programmed in many languages, but not for a while. Good thing we are starting with an existing example!

wilmorrtr commented 4 years ago

All good questions. My intention is to create an API that will return the current state, the last x recommendations, etc. Then I can implement it as a service for the existing trade-management app I wrote.

It was initially designed for stocks but can do anything I need it to. It keeps track of transactions and such, and I can put the logic in there for what to do, all without having to use Python too much :P I will probably end up spinning up a Docker container running this once it’s golden.

Happy to share anything I come up with!

wilmorrtr commented 4 years ago

I rather suspect I’ve made some error somewhere.

Got it running fine in the end.

Here’s my end result against my validation data:

```
True buy: 3016
True sell: 2911
False buy: 55
False sell: 57
no action: 173
Sell Precision: 0.9807951482479784
Buy Precision: 0.982090524259199
Recall: 0.9814513504718516
Precision: 0.9814538830932273
F-measure: 0.9811231396383895
```

I’m going to trash it all and do it again to make sure I’m comfortable, then I’ll set up a websocket based on its rolling output.

lmcasam commented 4 years ago

Very good!

I am working on a backtesting Python package now, to be able to backtest these recommendations.

ashokballolli commented 4 years ago

I solved this (potentially - I am still figuring out what does what) by changing all instances of val_acc to accuracy based on it being a tracked metric. I might be wildly wrong - we’ll see after it’s trained up!

Hi @wilmorrtr, could you please let me know how you solved this issue? I am still struggling with KeyError: 'val_accuracy'. I tried replacing val_acc with val_accuracy, and the code now looks like this:

```python
target_dir = "./models/weights-improvement-{epoch:02d}-{val_accuracy:.2f}.hdf5"
if not os.path.exists(target_dir):
  os.mkdir(target_dir)
model.save('../src/models/model.h5')
model.save_weights('../src/models/weights.h5')

checkpoint = ModelCheckpoint(target_dir, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
```

Thanks in advance

cderinbogaz commented 4 years ago

Hi @wilmorrtr and @lmcasam, it is great to see the ideas you are sharing :) Would you be interested in contributing to the project?

lmcasam commented 4 years ago

Hi @cderinbogaz, yes, it would be cool to work together and see if we can come up with something more refined and interesting.

tackelua commented 4 years ago

@ashokballolli Did you get predict running successfully?