mccorby / PhotoLabellerServer

Federated Learning: Parameter Server doing aggregation of updates to a model coming from clients participating in a Federated Learning setup. See also the Android application companion at https://github.com/mccorby/PhotoLabeller
MIT License
50 stars 13 forks

Initial model generation #3

Open SawsanAbdulRahman opened 5 years ago

SawsanAbdulRahman commented 5 years ago

On the client side, when running the application, I can see that the initial model used is the one located in the assets folder. If we create the initial model on the server (PhotoLabellerServer/model/src/main/kotlin/com/mccorby/photolabeller/ml/Main.kt) and then use the generated model in the app, the model misbehaves: the predictions for the images, as well as the scores at each iteration when training the new model, show NaN values.

So how was the initial model in the assets folder generated?

mccorby commented 5 years ago

Hi, thanks for trying this project. I have just trained the model and deployed it in the app, and I could see it working as expected (at least the prediction). Does the model work on the server? I mean, do the predictions work in this project with the model?

If it does, could you try training the model again? The system is not very stable (please note this is a PoC) due to the use of images in Android, but those NaN values are not expected.

If nothing else works, could you send me the model to at gmail?

SawsanAbdulRahman commented 5 years ago

I figured out that the server sometimes generates good models (cifar_federated of 4825 KB) that work fine on the mobile device, and sometimes it doesn't (cifar_federated of 7 KB), in which case the clients show NaN values. Yet in both cases the eval function on the server side doesn't work. I get the following error:

=====eval model========
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.intellij.rt.execution.CommandLineWrapper.main(CommandLineWrapper.java:67)
Caused by: java.lang.IllegalArgumentException: bound must be positive
    at java.util.Random.nextInt(Random.java:388)
    at org.nd4j.linalg.util.ArrayUtil.buildInterleavedVector(ArrayUtil.java:1679)
    at org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory.shuffle(CpuNDArrayFactory.java:814)
    at org.nd4j.linalg.factory.Nd4j.shuffle(Nd4j.java:452)
    at org.nd4j.linalg.dataset.DataSet.shuffle(DataSet.java:619)
    at org.datavec.image.loader.CifarLoader.convertDataSet(CifarLoader.java:380)
    at org.datavec.image.loader.CifarLoader.next(CifarLoader.java:424)
    at org.datavec.image.loader.CifarLoader.next(CifarLoader.java:392)
    at org.deeplearning4j.datasets.iterator.impl.CifarDataSetIterator.next(CifarDataSetIterator.java:110)
    at com.mccorby.photolabeller.ml.trainer.CifarTrainer.eval(CifarTrainer.kt:100)
    at com.mccorby.photolabeller.ml.MainKt.main(Main.kt:39)
    ... 5 more
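The innermost cause in this trace is `java.util.Random.nextInt` rejecting a non-positive bound. As a minimal sketch, that behaviour can be reproduced in isolation with no DL4J code involved. The class name `BoundCheck` and the helper `messageFor` are hypothetical, made up only for this illustration:

```java
import java.util.Random;

public class BoundCheck {
    // Hypothetical helper: calls Random.nextInt(bound) and returns the
    // exception message if the call is rejected, or null if it succeeds.
    static String messageFor(int bound) {
        try {
            new Random(42).nextInt(bound);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A positive bound is accepted, so no message is returned.
        System.out.println("bound=10 -> " + messageFor(10));
        // A zero bound is rejected with an IllegalArgumentException.
        System.out.println("bound=0  -> " + messageFor(0));
    }
}
```

One possible reading, which the trace alone does not confirm, is that `DataSet.shuffle` was asked to shuffle a dataset with no examples, so the bound passed down to `nextInt` was zero. That is an assumption worth checking against the CIFAR data actually available to the server when `eval` runs.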

koustabhdolui commented 5 years ago

Hello @mccorby and @SawsanAbdulRahman. Great work, @mccorby, on implementing Federated Learning. I came across your article on proandroiddev and am trying to work with this repository, but I am unable to make it run. Can you suggest which run configuration I should choose in IntelliJ? Using com.mccorby.photolabeller.ml.MainKt as the main class does not work: I get an error saying com.mccorby.photolabeller.ml.MainKt is not found in module "PhotoLabellerServer".