hatomist / openhaystack-python

Python library to decrypt Airtag reports
Apache License 2.0

Data points are duplicated each time reports are fetched #1

Closed: zakyum closed this issue 3 years ago

zakyum commented 3 years ago

Hello mate, I'm running a fork of your project and just wanted to let you know that the new commit introduced a bug. Namely this line where a tag is given to a data point: https://github.com/hatomist/openhaystack-python/blob/d076f332efc0c32ac403ae629d6853642e7e898e/daemon/main.py#L24

Fun fact (I didn't know this before today either): Python's hash() function isn't deterministic between program runs for strings and bytes, as it is salted with a new random seed on each run for security reasons (https://stackoverflow.com/questions/27522626/hash-function-in-python-3-3-returns-different-results-between-sessions). As such, on each run that fetches the data points, the same data point gets a new hash and is added again, i.e. it is duplicated...
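To make the effect described above concrete, here is a minimal sketch that computes hash() of the same string in separate interpreter processes. By default each process picks a random seed; pinning PYTHONHASHSEED to two different values simply simulates two independent runs in a repeatable way. The payload string and the helper name are made up for illustration and are not taken from the project.

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(text: str, seed: str) -> int:
    """Return hash(text) as computed by a new Python process.

    `seed` is passed as PYTHONHASHSEED; by default Python behaves as if
    a different random seed were chosen on every start.
    """
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


payload = "example-payload"  # hypothetical stand-in for a report payload

a = hash_in_fresh_interpreter(payload, "1")  # "run 1"
b = hash_in_fresh_interpreter(payload, "2")  # "run 2" with another seed
c = hash_in_fresh_interpreter(payload, "1")  # same seed as "run 1"

# Same seed gives the same value; different seeds almost certainly do not.
print("run 1 vs run 2 equal:", a == b)
print("run 1 vs run 1 again equal:", a == c)
```

Any value derived from hash() on a string is therefore only stable within a single process, which is why it does not work as a persistent ID for a data point.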

Would submit a pull request, but it's already getting late and dunno when I'm gonna have time to tidy up and commit this so here is a quick one-liner which fixes the problem.

from hashlib import md5
.tag('report_id', md5(result['payload'].encode("UTF-8")).hexdigest())

MD5 is deterministic between runs and one of the faster hash algorithms in hashlib. By the birthday bound, collisions for its 128-bit output should only become likely at around 2^64 data points, so for our use case I'd say it's good enough. The hash might even be trimmed to make it more storage-efficient, because we won't have enough data points in 7 days to run into collisions (I think...). But I didn't have time yet to think it through, so I went with the full hash.
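As a rough check of the trimming idea, the sketch below derives a deterministic ID from a payload string and estimates the collision probability for a truncated digest using the standard birthday approximation p ≈ 1 - exp(-n(n-1) / 2^(bits+1)). The function names and the report counts are assumptions chosen for illustration; they are not part of the project.

```python
import math
from hashlib import md5


def report_id(payload: str, hex_chars: int = 32) -> str:
    """Deterministic ID: the first `hex_chars` hex digits of the MD5 digest."""
    return md5(payload.encode("utf-8")).hexdigest()[:hex_chars]


def collision_probability(n_reports: int, hex_chars: int) -> float:
    """Birthday approximation for at least one collision among n_reports IDs."""
    bits = 4 * hex_chars  # each hex digit carries 4 bits
    exponent = n_reports * (n_reports - 1) / 2 ** (bits + 1)
    return -math.expm1(-exponent)  # 1 - exp(-x), accurate for tiny x


# The same input always yields the same ID, in every run.
print(report_id("hello"))      # 5d41402abc4b2a76b9719d911017c592
print(report_id("hello", 16))  # 5d41402abc4b2a76

# Hypothetical volume: 100,000 reports with IDs trimmed to 16 hex digits (64 bits).
print(collision_probability(100_000, 16))
```

Under these assumptions, a 64-bit ID stays far below a one-in-a-billion collision chance for 100,000 reports, so trimming to half the digest looks safe for a week of data. Shorter prefixes can be checked the same way before committing to them.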

Also thanks for making this amazing integration!

hatomist commented 3 years ago

Oh, wow. I didn't know that either, thank you. I'll fix that today. I've also asked @Sn0wfreezeDev to update his server so we can download only the last N minutes of reports, instead of a week. It would make fetcher runs significantly faster, as we won't need to decrypt all of the reports or create any additional logic to filter out older ones.

Sn0wfreezeDev commented 3 years ago

Hi @hatomist,

great project! We are working on a more official server release. It will let you specify the time range of the reports. But it might actually be that Apple always sends a week of reports. I think there has been an issue with that in the past.

hatomist commented 3 years ago

Thank you!


ch0rus commented 3 years ago

Thank you @hatomist and @Sn0wfreezeDev :)

I've been using simple_server (osx running in kvm) with openhaystack-python for about 2 weeks, it's working great :)

I've been trying to find a map plugin for Grafana where I can plot different size circles depending on reported confidence. I'm thinking that would give a good estimate of the real position (where the circles overlap). Haven't found any suitable plugin yet; both Track map and worldmap size the circles relative to the zoom level.

Btw, do you know what the confidence actually means? I'm guessing it's either the reporting phone's estimated position accuracy, or the signal strength of the received advertisement, or both :)

Sn0wfreezeDev commented 3 years ago

We are not sure about the confidence, but I think that Apple drops reports with a lower confidence and shows only reports with a high confidence

hatomist commented 3 years ago

Hi! I am currently working on a Telegram bot which should be much friendlier for an average user to host and/or use. I’m not entirely sure yet whether I’ll render maps to a PNG or just host an interactive Leaflet map, but it would be possible to add objects supported by Folium there, like Circle, which renders a circle with a radius in meters, not relative to the zoom level. Though I don’t know of a relation between the accuracy value and the circle radius :(


zakyum commented 3 years ago

I've been trying to find a map plugin for Grafana where I can plot different size circles depending on reported confidence. I'm thinking that would give a good estimate of the real position (where the circles overlap).

@ch0rus I have been thinking about the same, but you'd have to do at least some merging/filtering of the raw location reports, as otherwise you'd have too many overlapping circles for static locations. I have only 2 iPhones in the household and in just the last 6h have accumulated more than 60 reports (my home is remote, so I can confirm no other iPhones are present).

image

It's already crowded with just markers, let alone if each one was a circle or if you were in an area with more iPhones. Also, the accuracies range from 80 to 170, so there would be circles of all sizes.

Do you maybe have an idea on how to solve this?
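One possible starting point for the merging/filtering mentioned above, offered as a sketch rather than a tested solution: greedily group reports that fall within a fixed distance of each other and replace each group with a single weighted average point. It assumes the accuracy value is a radius in meters where smaller means better, which nobody in this thread has confirmed, and it uses inverse-square weighting as an arbitrary choice. The Report class, the function names and the 100 m threshold are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Report:
    lat: float
    lon: float
    accuracy: float  # ASSUMPTION: radius in meters, smaller is better


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def merge_reports(reports, radius_m: float = 100.0):
    """Greedy merge: a report joins the first cluster whose first member
    lies within radius_m; otherwise it starts a new cluster."""
    clusters = []
    for rep in reports:
        for cluster in clusters:
            first = cluster[0]
            if haversine_m(rep.lat, rep.lon, first.lat, first.lon) <= radius_m:
                cluster.append(rep)
                break
        else:
            clusters.append([rep])

    merged = []
    for cluster in clusters:
        # Inverse-square weights: reports with a smaller accuracy value count more.
        weights = [1.0 / max(r.accuracy, 1.0) ** 2 for r in cluster]
        total = sum(weights)
        lat = sum(w * r.lat for w, r in zip(weights, cluster)) / total
        # Plain averaging of longitudes; not valid across the antimeridian.
        lon = sum(w * r.lon for w, r in zip(weights, cluster)) / total
        merged.append(Report(lat, lon, min(r.accuracy for r in cluster)))
    return merged


# Made-up sample data: two reports close together, one far away.
sample = [
    Report(52.0000, 13.0000, 80.0),
    Report(52.0001, 13.0001, 170.0),
    Report(53.0000, 14.0000, 100.0),
]
merged_sample = merge_reports(sample)
print(len(merged_sample), "merged points")
```

With the sample data the first two reports collapse into one point near the more accurate of the two, and the distant report stays separate. A time window or a proper clustering algorithm would probably be needed for a moving tag.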