schollz / find3

High-precision indoor positioning framework, version 3.
https://www.internalpositioning.com/doc
MIT License

Original Find App appears to be vastly more accurate #61

Closed Thomas499 closed 6 years ago

Thomas499 commented 6 years ago

The original Find app currently appears to be significantly more accurate in real-world tests. I am wondering if it takes the a.i. a few days to figure out what to prioritize?

Is there any chance we can expect any neural networking code anytime soon? I am debating on if this is considered a.i. or just computer learning.

schollz commented 6 years ago

Could you please share the database that is performing better with the original FIND? I'd like to see how much better it is performing, and why. It's not intuitive to me that this could occur, but I suppose it is possible.

What do you mean neural networking code?

Thomas499 commented 6 years ago

The database for the original find app can be found at https://ml.internalpositioning.com/dashboard/knewtek_test. I noticed that the original find app can be trained in 30 seconds, and regardless of whether there are many data points or few, the original find app appears to be more accurate (in real testing, not the prediction accuracy shown on the dashboard). I tried to attach the file with all the data from the database for this experiment, but GitHub apparently does not support the format. You can download the data from the database by clicking on this link: https://ml.internalpositioning.com/data/knewtek_test.db

The database for the find3 app should have very similar data. I do believe the original find server takes longer to calibrate and will not accept incoming data until calibration is complete, which is why the find3 server has more data points. We also crashed the original find server by accident a few days ago, which is another reason the find3 has more data points stored.

Still, the data points in both databases are all good. Yet the original find appears to be much more accurate in all real-life tests.

The find3 database that I used to represent the find3 for this experiment can be found at https://cloud.internalpositioning.com/view/dashboard/knewtek_test. Notice the find3 in this experiment has 18971 data points. This should be more than enough data points for the find3 to work as intended, correct?

By neural networking code I mean something like https://www.youtube.com/watch?v=aircAruvnKk&t=1003s, which allows the program to keep processing the data constantly and become more accurate over time. It also doesn't require humans to tell the code what to look for.

It would be really cool if the code would play a game against itself and train itself to properly identify the real location based on the readings, similar to how AlphaGo Zero worked. https://www.theverge.com/2017/10/18/16495548/deepmind-ai-go-alphago-zero-self-taught AlphaGo Zero was a neural network.

Thomas499 commented 6 years ago

If you are going to do a neural network, I suggest adding a few more parameters. For example, first I'd recommend telling the system which rooms (if known) the wifi modems are in. This is especially true if the neural network will work across other families and give itself "basic instincts" that will help it adapt to each family.

I would also suggest randomizing the data the neural network learns from, so it doesn't learn that if one reading is in a room, then the next reading is likely in the same room. That logic appeared to be in effect when I was training. Someone else also stated they felt the same was happening: https://github.com/schollz/find3/issues/55

I will build a device that allows the a.i. to change the broadcast power output on the wifi modem if you want to try to add that functionality.

Another good thing may be to clarify if the wifi signal is 2.4 GHz or 5 GHz. Those do have different ranges and signal strengths. Again this information would be more helpful if the neural network was to create basic instincts for the rules regarding each family. If the data transfer rate was a factor that information would be much more helpful.
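As a rough illustration of the band idea, here is a minimal Python sketch that tags each reading with its band based on the channel frequency. The input format (`{mac: (rssi_dbm, freq_mhz)}`), the function names, and the approximate frequency ranges are assumptions made for this example; this is not the find3 payload format.

```python
def wifi_band(freq_mhz):
    """Classify a channel center frequency (MHz) into a Wi-Fi band.

    The ranges are approximate bounds for the 2.4 GHz and 5 GHz bands.
    """
    if 2400 <= freq_mhz < 2500:
        return "2.4GHz"
    if 5150 <= freq_mhz < 5900:
        return "5GHz"
    return "unknown"


def tag_fingerprint(readings):
    """Turn {mac: (rssi_dbm, freq_mhz)} into {"mac@band": rssi_dbm}.

    Hypothetical format: the band becomes part of the feature name, so a
    learner can treat the same router's two radios as separate signals.
    """
    return {
        f"{mac}@{wifi_band(freq)}": rssi
        for mac, (rssi, freq) in readings.items()
    }


# Made-up readings for illustration only.
sample = {
    "aa:bb:cc:dd:ee:01": (-48, 2437),
    "aa:bb:cc:dd:ee:02": (-63, 5180),
}
print(tag_fingerprint(sample))
```

Keeping the band in the feature name is only one option; it could also be stored as a separate field next to each reading.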

I can list a dozen or so other recommendations if you are interested, regarding the parameters you might want to store and the rules for how the game the a.i. plays against itself might be scored to help it learn.

schollz commented 6 years ago

Let's move the NN to another thread.

What do you mean real-world tests? When I looked at your databases (the ones with > 100 points) they both have fairly good statistics.

Thomas499 commented 6 years ago

Yes, they have good statistics as shown by the dashboard, but in real-world testing (results produced by real tests, not computer simulations) the original find app appears to be vastly more accurate in telling me the name of the correct room that I am in.

This is true for the following tests:

  1. Low amount of data sets (5-9 per location). Results: original find app pretty good at predicting the correct location. Find3 terrible.
  2. High amount of data sets (200+ per location). Results: original find app upwards of 80% correct at predicting the correct location. Find3 40-50%.
  3. High amount of data sets (500+ per location) using 12-20 wifi modems during the training phase, then eliminating all wifi signals from being used except the 6 modems that are actually in the house. Results: original find still pretty good at predictions (>60%). Find3 less than 20% correct.

Right now I am using a dataset that I trained with all wifi signals, and I have disabled all but the ones that are generated in the house from being used. I am in the den. Original find app says I am in the den. Find3 says I am in the 2nd child's room, which is on another floor and on the opposite side of the house.

If the find3 a.i. says the locations have fairly good statistics, then that is probably why the a.i. believes the location to be true. But real tests show that what the find3 believes to be correct is not nearly as accurate as the original find, based on my personal observations in this building.

schollz commented 6 years ago

I want to understand more about this scenario. Can you elaborate on what your personal observations are? Like, were you able to tabulate how often it was right (for real) in a given location? For this type of estimation I like to write down the time that I was in a room and then later check the tracking data to see how well it correlates over many data points. I haven't done this with FIND vs FIND3, though.
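A minimal Python sketch of that kind of tabulation, assuming a hand-written log of visits as `(start, end, room)` tuples and tracking output as `(timestamp, predicted_room)` tuples. Both formats and the function name are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def per_room_accuracy(visits, predictions):
    """Tally how often the predicted room matched the room actually visited.

    visits:      list of (start_ts, end_ts, room) written down by hand
    predictions: list of (ts, predicted_room) taken from the tracking data
    Returns {room: (correct, total)}. Predictions that fall outside every
    logged visit are ignored because there is no ground truth for them.
    """
    tally = defaultdict(lambda: [0, 0])
    for ts, predicted in predictions:
        for start, end, room in visits:
            if start <= ts < end:
                tally[room][1] += 1
                if predicted == room:
                    tally[room][0] += 1
                break
    return {room: (correct, total) for room, (correct, total) in tally.items()}


# Made-up log for illustration only (timestamps in seconds).
visits = [(0, 100, "den"), (100, 200, "kitchen")]
predictions = [
    (10, "den"), (50, "kitchen"), (90, "den"),
    (120, "kitchen"), (150, "kitchen"), (250, "den"),
]
for room, (correct, total) in per_room_accuracy(visits, predictions).items():
    print(f"{room}: {correct}/{total} correct")
```

Running the same tally over the output of both servers for the same visit log would give per-room numbers that can be compared directly.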

Thomas499 commented 6 years ago

I did not record the time that I was in a room. I have the custom app set up to train both original find and find3 at the same time. The find3 stores more of the readings because the original find does not store readings while it is recalculating.

I do have a feature set up within the app, when running find location, where I can tell it which location I am actually in, then hit record, and it records whether the predicted location is the real location for both original find and find3. I haven't used this feature much yet, because it is pretty obvious the original find is more accurate in this building in all tests so far. But if you want me to record the actual real-world accuracy for both original find and find3, I will.

Do you have a preferred method you would like me to use? E.g. low data points, trained in the center of each room, trained all over each room, lots of data points, only a few data points taken at a time in each room over the course of multiple days and in different conditions, or data trained with multiple wifi modems followed by elimination of all modems not in the particular building?

schollz commented 6 years ago

Just taking data points in each room like you normally would is sufficient.

schollz commented 6 years ago

I'm going to close this until then

Thomas499 commented 6 years ago

Apologies for the delay. Somehow walking up the steps did something to my lower back. Feels like I've been shot. Moving around after that was very painful. I hope to get more readings for you tomorrow.

In the meantime here are my notes:

Experiment 1 Outcome

Find3 = 45.83% accuracy

Find original = 44.44% accuracy

Trained for 30 seconds in each location (staying in the middle of the room), taking sample every 4 seconds.

Walked around each room and randomly clicked store button.

Note: find3 server had 516 data points.

Find original had 138 data points

Experiment 2 Outcome Find Original

Find3 = 53% accuracy

Find Original = 67% accuracy

Trained for an additional 30 seconds in each location taking samples every 4 seconds.

During training walked around instead of staying in same place

Find3 server had 949 data points

Find original had 246 data points

Experiment 3 Outcome Find Original

Used previously collected extensive data.

Collected accuracy of one room for 30 minutes

Find3= 69.88% accuracy

Find original = 89.67% accuracy

Find3 server had (find3 server is down right now will have to look later)

Find original had 6139 data points

On Thu, Apr 19, 2018 at 7:53 PM, Zack notifications@github.com wrote:

Closed #61 https://github.com/schollz/find3/issues/61.


schollz commented 6 years ago

Can you share the databases?

Thomas499 commented 6 years ago

Experiment 1 and 2.

find original database https://ml.internalpositioning.com/dashboard/knewtek_new

find3 database https://cloud.internalpositioning.com/view/dashboard/knewtek_new

Experiment 3

find original database https://ml.internalpositioning.com/dashboard/knewtek_short

find3 database https://ml.internalpositioning.com/dashboard/knewtek_short

I will do more experiments tomorrow and share the results.

On Thu, Apr 19, 2018 at 11:31 PM, Zack notifications@github.com wrote:

Can you share the databases?


schollz commented 6 years ago

Thanks, I don't think you need to do more experiments. I conducted an experiment with your giant dataset (knewtek_short). Basically I took your entire dataset of labelled fingerprints (the learning dataset), randomized them, and then used some of them for learning and the others for testing (cross-validation). I ran the exact same fingerprints through both FIND (original) and FIND3 for both learning and testing. You can replicate my results by running a FIND instance on port 8001 and running FIND3 at the default ports (8003) and then running the attached script.
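For readers without the attachment, the shuffle-then-split procedure described above can be sketched roughly as follows. This is not the attached script: the function names are hypothetical, and the 1-nearest-neighbour classifier is only a stand-in so the example runs on its own. It is not the algorithm used by FIND or FIND3.

```python
import random


def holdout_accuracy(fingerprints, train_fn, predict_fn, test_fraction=0.3, seed=0):
    """Shuffle labelled fingerprints, train on one part, test on the rest.

    fingerprints: list of (features, label), features being {mac: rssi_dbm}
    train_fn(train) -> model; predict_fn(model, features) -> predicted label
    Using the same seed for two systems gives both the same split.
    """
    shuffled = list(fingerprints)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    model = train_fn(train)
    correct = sum(1 for features, label in test if predict_fn(model, features) == label)
    return correct / len(test)


# Stand-in classifier (assumption): plain 1-nearest-neighbour on RSSI values.
def train_nearest(train):
    return train  # 1-NN simply remembers the training fingerprints


def predict_nearest(model, features):
    def distance(a, b):
        # Access points missing from a fingerprint count as a weak -100 dBm.
        return sum((a.get(k, -100) - b.get(k, -100)) ** 2 for k in set(a) | set(b))
    return min(model, key=lambda item: distance(item[0], features))[1]


# Made-up, well-separated fingerprints for illustration only.
data = [({"ap1": -40 - i, "ap2": -80 + i}, "den") for i in range(5)]
data += [({"ap1": -80 + i, "ap2": -40 - i}, "kitchen") for i in range(5)]
print(holdout_accuracy(data, train_nearest, predict_nearest))
```

To compare two systems, `train_fn` and `predict_fn` would be swapped for calls into each system while keeping the seed fixed, so both learn from and are tested on exactly the same fingerprints.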

My results show that the original FIND has an accuracy of 87% and FIND3 has an accuracy of 92% for your dataset. These results are very comparable with results for my databases. When I run the same test on my own dataset (the one I'm currently using with passive scanning), FIND has an accuracy of 77% while FIND3 has an accuracy of 91%. When I ran against another dataset I did with active learning I got an accuracy of 100% with FIND and accuracy of 98% with FIND3. Basically, when I look at your dataset, and my datasets, FIND3 is either just as good, or better than FIND.

I have no idea how you got those percentages between FIND and FIND3 nor what exactly you were measuring.

I'm going to close this issue. If you'd like to discuss further, I suggest moving to the slack chat.

compare_find_and_find3.zip