aqibsaeed / Genetic-CNN

CNN architecture exploration using Genetic Algorithm
Apache License 2.0

Same Accuracy #6

Closed DarkoNedic closed 5 years ago

DarkoNedic commented 5 years ago

Hello,

thank you for your work and for the publication of the code.

I tried your genetic algorithm but ran into a problem. To see the accuracy values, I uncommented the line #print('Accuracy: ',score)

When I run the process with the stages and number of nodes you set:

STAGES = np.array(["s1","s2","s3"]) # S
NUM_NODES = np.array([3,4,5]) # K

I always get an accuracy of 0.101, with no improvement over the generations.
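A value stuck near 0.101 is close to 1/10. Assuming a balanced 10-class dataset (the thread does not name the dataset, so this is an assumption), that is roughly what a classifier would score by always predicting the same class. A minimal sketch of such a check follows; the helper name `near_chance`, the tolerance, and the 0.95 comparison value are hypothetical and not part of the repository:

```python
def near_chance(accuracy, num_classes=10, tol=0.01):
    """Return True if accuracy is within `tol` of uniform-guess accuracy.

    Assumes balanced classes, so chance level is 1 / num_classes.
    """
    chance = 1.0 / num_classes
    return abs(accuracy - chance) <= tol


# 0.101 is the value reported above; 0.95 is a made-up value for contrast.
for acc in (0.101, 0.95):
    print(acc, "near chance" if near_chance(acc) else "above chance")
```

If every individual in a population is flagged this way, the networks are probably not learning at all, and the problem may lie in training rather than in the genetic search.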

When I do changes to this and set:

STAGES = np.array(["s1"]) # S
NUM_NODES = np.array([3]) # K

it sometimes behaves more as expected, since the accuracies now differ, but they are still very poor (e.g. 0.1946 or 0.1018, and again some 0.101 values). Furthermore, a single stage seems rather small. I also experimented with TRAINING_EPOCHS and BATCH_SIZE, but nothing really changed.
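To put a number on how small a single stage is, here is a rough sketch. It assumes the encoding described in the Genetic CNN paper, where a stage with K nodes is represented by K(K-1)/2 bits, one per possible connection between an earlier and a later node. The function name `bits_per_stage` is hypothetical and not taken from the repository:

```python
import numpy as np


def bits_per_stage(num_nodes):
    """Bits needed per stage, assuming K*(K-1)/2 connection bits for K nodes."""
    k = np.asarray(num_nodes)
    return k * (k - 1) // 2


for nodes in ([3, 4, 5], [3]):
    bits = bits_per_stage(nodes)
    total = int(bits.sum())
    # 2**total is the number of distinct bit strings (candidate encodings).
    print(nodes, bits.tolist(), total, 2 ** total)
```

Under that assumption, NUM_NODES = [3] leaves only 3 bits, so at most 8 distinct encodings exist, whereas [3,4,5] gives 19 bits.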

I tried it with the latest versions of the libraries and also with older versions released before the 12th of March 2017 (your last commit), but the accuracy does not differ. For the setup with the old libraries, pip freeze gives:

deap==1.0.1
numpy==1.16.4
protobuf==3.8.0
py-dag==2.5.0
scipy==0.19.0
six==1.12.0
tensorflow==1.0.1

Is it intended to work like this? If so, many different individuals would end up with the same accuracy, so the ranking may not be very meaningful.

Or am I doing something wrong, have I misunderstood something, or are there other configurations I should consider?

DarkoNedic commented 5 years ago

Well, I overlooked that someone (#3) had the same problem. Still, even after reading the paper, I don't really understand why this is happening.

aqibsaeed commented 5 years ago

Hi,

First, to be clear, the paper is not mine; I only implemented the idea it presents. Before using a genetic algorithm to optimize the architecture, I would suggest looking at the dataset (check that it is correctly labeled and normalized), then training a simple model and making sure the loss decreases. Only after that, try the genetic algorithm to optimize the architecture. Again, start with simple building blocks to make debugging easier. Importantly, I merely provided a very simplistic architecture; you need to find a building block that works on your problem and then use the GA to find the optimal connections. Hope this helps.
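The checks suggested above can be sketched roughly as follows. This is only an illustration on synthetic data with a plain NumPy softmax classifier, not the repository's TensorFlow model; the blob dataset, learning rate, and step count are all assumptions chosen to keep the example self-contained:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a real dataset: three Gaussian blobs in 2-D.
num_classes, per_class = 3, 100
centers = np.array([[0.0, 4.0], [4.0, 0.0], [-4.0, -4.0]])
X = np.concatenate([c + rng.normal(size=(per_class, 2)) for c in centers])
y = np.repeat(np.arange(num_classes), per_class)

# Check 1: every expected label is present, and nothing else.
assert set(np.unique(y).tolist()) == set(range(num_classes))

# Check 2: normalize features to zero mean and unit variance.
X = (X - X.mean(axis=0)) / X.std(axis=0)


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# Check 3: train a simple model and record the loss at every step.
W = np.zeros((2, num_classes))
b = np.zeros(num_classes)
onehot = np.eye(num_classes)[y]
learning_rate = 0.5
losses = []
for step in range(200):
    p = softmax(X @ W + b)
    losses.append(float(-np.log(p[np.arange(len(y)), y]).mean()))
    grad = (p - onehot) / len(y)  # gradient of mean cross-entropy w.r.t. logits
    W -= learning_rate * (X.T @ grad)
    b -= learning_rate * grad.sum(axis=0)

accuracy = float((softmax(X @ W + b).argmax(axis=1) == y).mean())
print("first loss:", losses[0], "last loss:", losses[-1], "accuracy:", accuracy)
```

If the loss does not fall on a setup this simple, the issue is likely in the data pipeline or the training loop, and adding a genetic search on top would not fix it.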

DarkoNedic commented 5 years ago

Thank you for your response. I have now figured out what the code does. I modified it so that more channels can be used per convolution, and made some other changes, such as keeping the output size the same for every convolution within a stage. As a result, I get better accuracies (the correct ones) as well as better fitness scores. I've forked your repository and will share my modifications there.

So, yes, it helped, indeed. Thank you very much!
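As a side note on keeping the output size the same per convolution: the arithmetic below follows TensorFlow's documented "SAME" and "VALID" padding rules. The 28-pixel input, 5x5 kernel, and three-layer chain are hypothetical values for illustration, not the settings used in the repository or the fork:

```python
import math


def conv_output_size(in_size, kernel, stride=1, padding="SAME"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "SAME":
        return math.ceil(in_size / stride)
    if padding == "VALID":
        return math.ceil((in_size - kernel + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


# Chain three stride-1 convolutions and track the feature-map size.
size_same = size_valid = 28
for _ in range(3):
    size_same = conv_output_size(size_same, kernel=5, padding="SAME")
    size_valid = conv_output_size(size_valid, kernel=5, padding="VALID")

print("SAME:", size_same, "VALID:", size_valid)
```

With stride 1, "SAME" padding preserves the spatial size across the chain while "VALID" shrinks it at every layer. Matching sizes matter if node outputs within a stage are merged element-wise, as in the paper's design.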