Closed sonjageorgievska closed 8 years ago
Let's first try to find randomized addresses, using Alexey's insights.
Maybe we could also investigate whether there is a correlation between addresses disappearing and randomized addresses appearing? I mean, for each randomized address, there should be a non-randomized address missing.
On Mon, Jun 13, 2016 at 2:48 PM, sonjageorgievska notifications@github.com wrote:
Hi there! This is interesting, but better to leave it for future work (a lot of theoretical work is required to put it in the present paper). We have a flag for the randomized addresses, and I already did some comparison in March. I will try to estimate the number p, so that you can use it directly. This number would also change over months/years, depending on Apple :) More info: see the pictures addresses_per_second_nonrandomized and randomized in the resultsFromAnalysis folder. I think the Pearson correlation between both series was quite high, ~0.85.
Edit: just checked the Pearson correlation between detected non-randomized and randomized addresses per minute. It is 0.983387 :) Per two minutes: 0.9895045
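For reference, a minimal sketch of how such a per-minute correlation could be computed. This is not the script used for the numbers above; the function names, the input format (detection timestamps in seconds from the start of the recording) and the toy data are all assumptions for illustration.

```python
import numpy as np

def counts_per_bin(timestamps_s, bin_s, t_end_s):
    """Count detections in consecutive time bins of width bin_s seconds."""
    edges = np.arange(0, t_end_s + bin_s, bin_s)
    counts, _ = np.histogram(timestamps_s, bins=edges)
    return counts

def pearson(a, b):
    """Pearson correlation coefficient between two equally long series."""
    return float(np.corrcoef(a, b)[0, 1])

# Toy data (hypothetical): detection times in seconds.
nonrandomized_ts = [5, 10, 70, 80, 90, 130]
randomized_ts = [6, 75, 85, 131]

nonrand_counts = counts_per_bin(nonrandomized_ts, bin_s=60, t_end_s=180)
rand_counts = counts_per_bin(randomized_ts, bin_s=60, t_end_s=180)
r_per_minute = pearson(nonrand_counts, rand_counts)
```

Changing `bin_s` to 120 gives the two-minute variant. Wider bins smooth out noise, which is one plausible reason the correlation goes up with bin width.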
So, @philiprn, I found out that the ratio randomized/non-randomized is at most 0.225. This means that if you exclude all randomized addresses when calculating the density, then after all calculations are done, you can scale the histogram by 1.225 to make up for the left-out randomized addresses.
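A rough sketch of that correction, assuming the density is a plain histogram over some per-detection value and that each detection carries the randomized flag. The function name, argument layout and toy data are hypothetical; only the 0.225 ratio comes from this thread.

```python
import numpy as np

def corrected_histogram(values, is_randomized, bins, ratio=0.225):
    """Histogram of non-randomized detections, scaled up by (1 + ratio).

    ratio is the randomized/non-randomized ratio (at most 0.225 here),
    so the scale factor is 1.225.
    """
    values = np.asarray(values, dtype=float)
    keep = ~np.asarray(is_randomized, dtype=bool)  # drop randomized addresses
    counts, edges = np.histogram(values[keep], bins=bins)
    return counts * (1.0 + ratio), edges

# Toy data (hypothetical): one value per detection plus its randomized flag.
values = [0.5, 1.5, 1.6, 2.5]
flags = [False, False, True, False]
scaled, edges = corrected_histogram(values, flags, bins=[0, 1, 2, 3])
```

Since 0.225 is an upper bound on the ratio, the scaled histogram is an upper estimate. The correction also assumes the ratio is roughly the same in every bin.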
One solution is:
After all, we only detect a fraction x of the people, because not everybody has a smartphone and because some people have two smartphones. The number x should be available online in some reports or papers. Then our calculations should scale to take x into account, too.
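Both corrections could be combined roughly as below. This is only a sketch: `estimate_people` is a hypothetical helper, x is treated as the average number of detected devices per person, and the value 0.7 is a placeholder, not a figure from any report.

```python
import numpy as np

def estimate_people(nonrandomized_counts, x, randomized_ratio=0.225):
    """Estimate people from counts of non-randomized addresses.

    First scale by (1 + randomized_ratio) to account for excluded
    randomized addresses, then divide by x, the assumed number of
    detected devices per person.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    counts = np.asarray(nonrandomized_counts, dtype=float)
    return counts * (1.0 + randomized_ratio) / x

# Placeholder x = 0.7 (assumption for illustration only).
people = estimate_people([80, 40], x=0.7)
```

With these placeholder numbers, 80 non-randomized addresses become about 98 devices and about 140 people.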