openatv / enigma2

openatv-gui
GNU General Public License v2.0

InitGeolocation: run at every boot / incorrect info via VPN #1563

Closed Azureit closed 4 years ago

Azureit commented 4 years ago

Why does this need to run at every boot when the language and timezone have already been configured manually? It will also produce incorrect information if my Internet connection goes through a VPN.

IanSav commented 4 years ago

The code runs to initialise the geolocation data. The data is available to any Enigma2 code that wants to use it.

The code has been updated to no longer return any data if a VPN is detected.
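
A minimal sketch of that behaviour, assuming the lookup answer is a dictionary carrying a boolean "proxy" key (the key name appears in the patch proposed later in this thread). The function name usable_geolocation is hypothetical and this is not the actual Enigma2 code:

```python
def usable_geolocation(data):
    """Return the lookup result, or an empty dict if it looks like a proxy/VPN.

    Hypothetical helper: assumes `data` is the decoded JSON answer and that
    a true "proxy" value means the address belongs to a VPN or proxy.
    """
    if not isinstance(data, dict) or data.get("proxy") is True:
        return {}
    return data


# A direct connection keeps its data, a detected proxy yields nothing.
print(usable_geolocation({"timezone": "Europe/Bratislava", "proxy": False}))
print(usable_geolocation({"timezone": "Europe/Berlin", "proxy": True}))
```

Callers then only need to check whether the returned dictionary is empty before using any field.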

Azureit commented 4 years ago

@IanSav Why not cache the data and reuse it at the next boot?

IanSav commented 4 years ago

Some of the data captured may change. The only way to validate the cache would be to do another lookup. ;)

ghost commented 4 years ago

Why not cache the data and reuse it at the next boot?

I have already raised various concerns about this.

I also wonder whether this was discussed beforehand. I hope the geolocation implementation does not reflect a single contributor's point of view.

Azureit commented 4 years ago

@Tony-il-Capo @IanSav Running a tracker at every boot is a privacy concern, and in my case it will be blocked by a captive portal and firewall. To make things worse, it uses HTTP.

For me the code is good at first boot to assist with the configuration, and the results can be cached for future use. After that the only thing that should change is the IP, but what for!?

The only use for this would be if I were travelling around the world with my enigma2 box. Then I would like it to update my timezone and possibly the weather app location :) but then again not the language, and the location is not very accurate.

IanSav commented 4 years ago

Here is the Privacy Policy for the site I am using in the code. I see no malicious intent or any data collection.

This is NOT a tracker. It is a data lookup service. The only tracking is to hold an IP address, in RAM, for 1 minute to ensure that the service is not being abused.

For the moment, once the user sets a time zone this code makes no further changes. I am expecting to see requests from users who travel with their receivers to want the time zone to update as they move from place to place. This capability can easily be added with this new infrastructure.

Exact location accuracy is not particularly important for things like time zones. The information can always be corrected by the user in the Enigma2 user interface. Users can also send updates and corrections to the geolocation service if they wish.

ghost commented 4 years ago

I see no malicious intent or any data collection.

Even if the server doesn't log any data, it is still a privacy issue that you refuse to see, and you keep ignoring that this creates unwanted mistrust in this project. To quote ip-api.com:

Enrich the data about your users with the city, region, zip, country, continent, timezone, latitude and longitude fields. Find out if users are using anonymous proxies or VPN services. Identify the organization, ISP, AS, and if they are using a mobile network.

And then it sends all that openly over the Internet. It really sounds like tracking to me.

This should have been implemented in a totally different way: first create a means for the user to give consent or not, starting from the very first fresh E2 (setup) boot, and only then use this "service".

IanSav commented 4 years ago

Please specify exactly what privacy issues are being caused in this case. In what way is the privacy being disclosed? What private data is being provided to this site?

Tracking implies that the data is captured and used for some other purpose. I see no evidence of that here. This is simply an enquiry to deliver publicly available core Internet information back to the initiating request. No data is collected from the user and there is no evidence that any data is sent elsewhere.

Please detail your proposed solution to achieve the required result within the Enigma2 context.

Given your concerns about privacy I assume that you don't use the Internet at all. ;) I suspect that your posts here attract far more tracking and data retention using information that you have provided than anything being done in the geolocation lookup. Remember that all you are providing is your current WAN IP address. By definition this information is provided for every connection you make on the Internet!

Azureit commented 4 years ago

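My concerns are:

  • http request can be packet sniffed;

  • I don't trust this company's data retention and usage of the data;

  • fingerprinting due to "run at every boot", and http request query and headers.

  • I don't want ads targeted at me because I have an enigma2 box, timed to when I turn on the box.

Please explain to me why this should run at every boot when I have everything configured (timezone/language...)? This geolocation lookup should be initialized when needed, not at every boot.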

vladc commented 4 years ago

I'm from the staff that operates ip-api.com, a service that launched in 2012. We're a real company in EU, a region with many privacy laws and GDPR. Our privacy policy is accurate. We do not have any logging for our endpoint at ip-api.com, we answer more than 1.5B requests every day and we have no analytical use for the IP addresses being looked up. As IanSav mentioned, we hold the calling IP address in RAM, for 1 minute to ensure that the service is not being abused (rate limiting), that's all.

Azureit commented 4 years ago

@vladc Thanks for taking the time, it is much appreciated. HTTP is unsafe for this, as I think you will agree; that's why HTTPS is not part of the free API. The issue is bigger because most of the free API servers are outside the EU.

vladc commented 4 years ago

@Azureit while SSL is generally always better, I don't think we can call http unsafe for this. If the code that parses our response has no vulnerability, I'm unsure what impact, if any, a MITM attack can have. Your IP address will always be made public, SSL or otherwise, and the response we return is available for anyone as long as they have your IP address.

ghost commented 4 years ago

@IanSav why can't you understand a simple thing: give the user the power to enable or disable this and the problem is solved. Both sides are happy. That way you cover all situations.

E2 has worked for years without this; it's not as if this is mandatory.

You always seem to be trying to 'reinvent the wheel' here and then defending a square one at all costs. It's not the first time you have pushed very hard for changes in E2 code while ignoring even senior contributors' opinions. This is a community; we should all be open to suggestions and different ideas, and then try to find the right balance.

The way this is going, I'm out. Do whatever you want. I have about ten ways to make sure my boxes never use that 'service', but I feel really bad for the users who are unaware of what is going on.

Bye.

IanSav commented 4 years ago

@Tony-il-Capo I am sorry you feel this way. I am forced to conclude that you are scaremongering and/or you don't understand how networking connections work. All connections to the Internet work by machines making initial unsecured connections to set up the secured connection paths. Each and every one of those unsecured connections has your IP address and probably much more other information. In this case only your current WAN IP address is used. You should also note that I do not provide the IP address; it is derived from the connection.

In this case there is a one to one connection from the local / home network to ip-api.com to get the information. There are no redirections or other sites involved. The data interchanged does not have any advertising or virus payload. Even if there was advertising or a virus sent it is not read, processed or used!

If you don't like my contributions to Enigma2 then please raise your issues with Captain and get him to speak to me. It would be significantly easier for me to abandon OpenATV and only provide my fixes and improvements to the other images. I am simply including OpenATV in my developments in response to requests and discussions with Captain.

atvcaptain commented 4 years ago

@all I see no privacy breach. Only the IP address sent to the server is used, and no other information. Every connection sends the IP address to the server, and a secure connection is not needed either, since the IP address is not secret. Location information is present in most other systems: it helps users get correct weather information, and PC browsers show content in your language. All of these use the public IP address.

IanSav commented 4 years ago

@Azureit:

  • http request can be packet sniffed;

There is no personal data in the request. It is a simple URL with nothing other than the requested data flag. There is nothing of value or information to be sniffed.

  • I don't trust this company's data retention and usage of the data;

As you don't provide any data I don't see how this is a concern. The company is simply returning accumulated public information derived from your ISP.

  • fingerprinting due to "run at every boot", and http request query and headers.

I don't understand this concern.

  • I don't want ads targeted at me because I have an enigma2 box, timed to when I turn on the box.

There are no ads or any other payload other than the requested JSON data. Any attempt to inject ads, viruses or any other data will cause the geolocation lookup to fail and return no data.

Please explain to me why this should run at every boot when I have everything configured (timezone/language...)?

This point has already been covered.

This geolocation lookup should be initialized when needed, not at every boot.

The information provided by the lookup could change from boot to boot.

I have thought about this feature for some time. I did not add it on a whim. It is there to assist users with their system configuration. As all the information is triggered from an open and public IP address and the data returned is also from freely available information I considered this to not create a security vulnerability.

The fact that I am continuing to engage in these discussions should be an indication that I take security concerns seriously and stand ready to address any real problems with the code.

atvcaptain commented 4 years ago

@IanSav but we have a major issue

The box boots and shows the wrong time.

After a reboot the clock was OK, but after some time the clock changed by 1 hour.

Zeit 1

Zeit 2

We are sending the log information privately.

Also, we don't need to set the timezone on every boot; normally that happens only in the start wizard. The wizard gives the user matching defaults for their location, such as timezone, UI language and weather location, and the user can accept them or use their own settings.

atvcaptain commented 4 years ago

At the moment all deep recordings are broken; we need a solution soon.

nickersk commented 4 years ago

tz always falls back to UTC, even if it is set manually...

IanSav commented 4 years ago

There seems to be an error and disparity between the "/etc/localtime" and Enigma2 timezone settings. If the user goes into the Enigma2 Timezone settings UI and selects the correct timezone the issue should go away.
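
One way to check for that disparity from a shell on the receiver is sketched below. It assumes the usual Linux layout in which /etc/localtime is a symlink into /usr/share/zoneinfo (as in the listings posted in this thread). The function names are hypothetical and the configured zone has to be supplied by the caller; nothing here reads the Enigma2 settings:

```python
import os


def localtime_zone(localtime="/etc/localtime", zoneinfo="/usr/share/zoneinfo"):
    """Return the zone name the localtime symlink points to, or None.

    Only the symlink itself is read (os.readlink), so a zone that is an
    alias of another zone keeps the name that was actually linked.
    """
    if not os.path.islink(localtime):
        return None  # plain file or missing: nothing to compare
    target = os.readlink(localtime)
    if not os.path.isabs(target):
        target = os.path.join(os.path.dirname(localtime), target)
    target = os.path.normpath(target)
    root = os.path.normpath(zoneinfo)
    if not target.startswith(root + os.sep):
        return None  # link points somewhere outside the zoneinfo tree
    return os.path.relpath(target, root)


def timezone_mismatch(configured, localtime="/etc/localtime", zoneinfo="/usr/share/zoneinfo"):
    """True when the configured zone differs from what the symlink says."""
    return localtime_zone(localtime, zoneinfo) != configured


print("System zone:", localtime_zone())
```

For example, timezone_mismatch("Europe/Bratislava") would be true on a box whose link has been reset to UTC.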

Are there any users having the issue for old timezones other than UTC?

nickersk commented 4 years ago

So, just now I manually set it to Europe/Slovakia and restarted the GUI. After the restart the menu shows Germany/Berlin, and the link in /etc is /etc/localtime -> /usr/share/zoneinfo/Europe/Bratislava (which is correct).

nickersk commented 4 years ago

And now, after some time, it jumped to Slovakia.

IanSav commented 4 years ago

I am also in Skype with Captain trying to understand the issue.

nickersk commented 4 years ago

After a full reboot:

lrwxrwxrwx 1 root root 23 Mar 18 06:54 /etc/localtime -> /usr/share/zoneinfo/UTC

and in the menu Germany/Berlin :-)

IanSav commented 4 years ago

A patch has been committed that should address the time zone issues while we try to find and fix the cause.

ghost commented 4 years ago

@Tony-il-Capo I am sorry you feel this way. I am forced to conclude that you are scaremongering and/or you don't understand how networking connections work. All connections to the Internet work by machines making initial unsecured connections to set up the secured connection paths. Each and every one of those unsecured connections has your IP address and probably much more other information. In this case only your current WAN IP address is used. You should also note that I do not provide the IP address; it is derived from the connection.

:rofl: Are you serious?!! Now you treat end users as very stupid people? I really can't believe that you now tried to lecture me on how internet works. Unbelievable.

It's like a stone wall; this whole conversation could be summed up as:

Seriously, a little more modesty and understanding would be appreciated.

FYI, there are VPNs and proxies that are detected as non-proxy (direct connection), so the geolocation data is used but is wrong. Examples are some VPN/Tor exit nodes that are popular for accessing geo-restricted services.

IanSav commented 4 years ago

@Tony-il-Capo I don't believe the rubbish you have been spouting. Your arguments defy how Internet connections work. It is demonstrating some serious issues with your scope of knowledge. When I asked you to provide constructive suggestions on how to realistically address your concerns you simply respond with insults. A totally unproductive logic that will not convince me or anyone else to change things.

Did you raise your concerns with OpenATV team leaders? I suspect not as I am working with them on a bug in OpenATV that is causing some of the time zone issues. They are very keen and happy to have the geolocation code.

When someone gives me log files that demonstrate real issues I will address them. Some serious debugging is currently happening with Captain and NickerSK trying to find the real cause of the time shift issues on OpenATV. A work around has already been merged and fixes the issue but we all want to find the underlying cause of the bug. It does not appear to be coming from any of my code. Something is changing /etc/localtime and resetting it to UTC without any user intervention.

I don't know why I am telling you the real problem as you are already convinced that I don't know anything and am just wasting everyone's time. How much code have you contributed to Enigma2?

By the way, attempting to bully me is not going to change my mind that the issues about which you complain are non issues. When you provide some facts, logic and/or detail to your complaint then you may get a more favourable hearing.

ghost commented 4 years ago

@IanSav Insults?? Where? Is telling you about a different perspective insulting you? I'm not sure what you want me to do or say. Now you just keep saying I know nothing, OK then! I have already told you many times that geolocation should be optional. Should I be the one implementing it? Well, I'm not a (Python) programmer, I only do programming for personal use.

I think it was already clear that here in my community there are several E2 users who have no background knowledge. I set up most of them in the past using OpenATV and they learned essential/basic GUI usage. Now it seems I've made the mistake of updating most of them. I'm here passing on the issues I hear from them. Do you want me to ask those people to debug? Really??? You probably think that every E2 user is here on GitHub!

Believe it or not, for me personally, as I told you before, I don't care anymore. I came here (again) believing it was the right thing to do, with the best intentions, but I cannot reason with closed-minded, self-centered people who only care about and see what's inside their bubble. I have now unsubscribed from notifications for this; I'm really tired of it.

Azureit commented 4 years ago

@atvcaptain

@all I see no privacy breach. Only the IP address sent to the server is used, and no other information. Every connection sends the IP address to the server, and a secure connection is not needed either, since the IP address is not secret. Location information is present in most other systems: it helps users get correct weather information, and PC browsers show content in your language. All of these use the public IP address.

Running this geolocation lookup at every boot can be used to fingerprint the user's home IP: it reveals that an enigma2 box is in use and at what times the user is at home and boots the box. With that information an ad provider running on a site the user visits knows that this home IP should be shown ads about satellite equipment, cable companies, IPTV packages... What is the use of initializing geolocation info at every boot when everything has been configured manually? It should be initialized when and if needed. At least convince @IanSav the #unmovablerock to use a far better open-source alternative, https://geoip.fedoraproject.org/city. The source to run this service is at https://github.com/fedora-infra/geoip-city-wsgi, so even OpenATV could host it. I'm advocating for better privacy in this open-source project not for myself, since I'm aware of the problems, but for everyone else. I can't imagine Fedora, for example, doing this at every boot just for the lulz.

IanSav commented 4 years ago

@Azureit I tried the site you suggested and it returned the wrong location for me.

I don't understand how a connection on the Internet creates a "fingerprint". All connections on the Internet share IP addresses. That can't be stopped or hidden. It is the fundamental way that the Internet and Ethernet work.

As for the geolocation call, I don't provide any information other than the data I want returned. I don't accept any data back from the geolocation lookup except for a valid json answer. Anything else returned is just thrown away. (I have never seen anything else returned.)
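
A rough sketch of that "valid JSON or nothing" rule, written for Python 3. It assumes a usable answer is a JSON object whose "status" field is "success" (the status check mirrors the snippet quoted later in this thread). The function name parse_geolocation is hypothetical and this is not the code in Geolocation.py:

```python
import json


def parse_geolocation(raw):
    """Decode a lookup answer, returning {} for anything unexpected.

    Non-JSON payloads (HTML, injected content, truncated data), JSON that
    is not an object, and answers without status "success" are discarded.
    """
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        # ValueError also covers json.JSONDecodeError and bad encodings.
        return {}
    if not isinstance(data, dict) or data.get("status") != "success":
        return {}
    return data


print(parse_geolocation('{"status": "success", "timezone": "Europe/Bratislava"}'))
print(parse_geolocation("<html>not what was asked for</html>"))
```

Because the payload is only ever handed to the JSON decoder, anything that is not well-formed JSON is dropped without being interpreted.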

Neither you nor Tony have provided any specific details about how your privacy is being compromised. What private data is being violated and what information is being leaked from your connection or your receiver? Your public / WAN IP address is very much not private.

Azureit commented 4 years ago

@IanSav

I tried the site you suggested and it returned the wrong location for me.

Report that to the project; for me it gives a more accurate position than ip-api.com. They are using the MaxMind GeoLite2 City database, the most used database for this type of service.

I don't understand how a connection on the Internet creates a "fingerprint"

The URI of the request is http://ip-api.com/json/?fields=33288191. The "?fields=33288191" part is one way to fingerprint your code; another is the default urlopen user agent, since not every application uses Python or urlopen. Combine that with the WAN IP of the user and you have a fingerprint.
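
To see what the user-agent point refers to, here is a small Python 3 sketch that makes no network request. It assumes the standard urllib.request module (the code under discussion is Python 2, where the equivalent lives in urllib2), and the helper names and the replacement header value are only illustrative:

```python
import urllib.request


def default_user_agent():
    """Return the User-Agent header urllib sends when none is set."""
    opener = urllib.request.build_opener()
    return dict(opener.addheaders).get("User-agent")


def build_request(url, user_agent="Mozilla/5.0"):
    """Build (but do not send) a request with an explicit User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


print("Default header:", default_user_agent())
request = build_request("http://ip-api.com/json/?fields=33288191")
print("Overridden header:", request.get_header("User-agent"))
```

The default value names the library and its version, which is what distinguishes a urlopen call from a browser request.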

You're still unable to give a valid reason why this should run at every boot, ignoring my manually configured settings, even if I don't need this and don't want this. Before running this geolocation code, check whether I have manual settings; if I don't, then run your geolocation code.

atvcaptain commented 4 years ago

You can add the line "127.0.0.1 ip-api.com" to /etc/hosts and the API call stops working.

Azureit commented 4 years ago

@atvcaptain Can you please add a Privacy ON/OFF button in the interface, to do that. Thanks

IanSav commented 4 years ago

@Azureit if it were possible to add a UI control to control this function I would have done it as part of the initial code implementation.

The mytest.py initialisation script for Enigma2 is quite disorganised and chaotic. There is no UI and the config engine is not available until considerably far down the code. I intend to address this at some point in the future but for now what you ask is not yet possible. The suggestion given by @atvcaptain should work for you to make the call non functional. (It does not need or use a UI or the Enigma2 config engine.)

The data "?fields=33288191" in the query is not a fingerprint. It is simply a bit map that describes the data I want returned from the geolocation service. There is absolutely nothing about you in that data. ALL Enigma2 users on all images are using exactly the same lookup request.

For your information here is how that number is determined: https://ip-api.com/docs/api:json. Feel free to explore the number generator on that page to see how the 33288191 number was generated. Again there is nothing personal in this number and it is not a fingerprint that makes your query any more identifiable than any other user of that site.
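
As an illustration of what "bit map" means here, the following Python 3 sketch lists which bit positions are set in 33288191. It deliberately does not name the fields, since the mapping of bits to field names is defined on the ip-api page linked above and is not reproduced in this thread:

```python
def set_bits(mask):
    """Return the positions of the bits that are set in `mask`, lowest first."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


FIELDS = 33288191  # value taken from the request URL quoted in this thread

bits = set_bits(FIELDS)
print("binary:", bin(FIELDS))
print(len(bits), "bits set, at positions:", bits)
```

Each set bit switches one field on, so every client that asks for the same set of fields sends the same number.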

Azureit commented 4 years ago

@IanSav

The data "?fields=33288191" in the query is not a fingerprint. It is simply a bit map that describes the data I want returned from the geolocation service. There is absolutely nothing about you in that data.

I know "?fields=33288191" is static and not about me. It fingerprints every enigma2 box running your code, and to associate it with me they have my WAN IP.


I did not test this, but consider how you could change this so that it does not run at every boot and respects manual settings.

In mytest.py:

profile("Geolocation")
# Not run at every boot
#import Tools.Geolocation
#Tools.Geolocation.InitGeolocation()

In Geolocation.py:

def InitGeolocation():
    global geolocation
    # Do lookup if not done already, or if info is older than a day (needs "import time" at the top of the file)
    if "lastchecked" not in geolocation or time.time() - geolocation["lastchecked"] > 86400:
        try:
            response = urlopen("http://ip-api.com/json/?fields=33288191", data=None, timeout=10).read()
            # print "[Geolocation] DEBUG:", response
            if response:
                geolocation = loads(response)
            status = geolocation.get("status", None)
            if status and status == "success":
                print "[Geolocation] Geolocation data initialised."
                # Timestamp of last check
                geolocation["lastchecked"] = time.time()
            else:
                print "[Geolocation] Error: Geolocation lookup returned a '%s' status!  Message '%s' returned." % (status, geolocation.get("message", None))
        except URLError as err:
            if hasattr(err, 'code'):
                print "[Geolocation] Error: Geolocation data not available! (Code: %s)" % err.code
            if hasattr(err, 'reason'):
                print "[Geolocation] Error: Geolocation data not available! (Reason: %s)" % err.reason
        except ValueError:
            print "[Geolocation] Error: Geolocation data returned can not be processed!"

def RefreshGeolocation():
    # Refresh if needed
    #global geolocation
    #geolocation = {}
    InitGeolocation()

Then in Timezone.py:

def InitTimeZones():
    # if the timezone value not set or set to a invalid one
    if not config.timezone.val.value or timezone_not_valid:
        # Initialize or refresh geolocation
        Tools.Geolocation.RefreshGeolocation()

        tz = geolocation.get("timezone", None)
        proxy = geolocation.get("proxy", False)
        if tz is None or proxy is True:
(...)

IanSav commented 4 years ago

I know "?fields=33288191" is static not about me, it fingerprint every enigma2 with your code, to associate with me they have my WAN IP.

No, it is simply a list of requested fields. ALL queries that ask for all fields will use the same number. This is in no way Enigma2 specific. The default data set returned is not good enough so I need to provide the list of fields I want to receive.

Thank you and I appreciate your effort to demonstrate your code proposal. The code suggestion presumes that the geolocation data is only used for the time zone. That is currently true as the feature has only just been added but will not hold true in the future. I am aware of desires to use geolocation data elements for other uses. I think I heard that it would be helpful for obtaining general location for weather data. There may well be other uses for the data.

I understand your desire not to use geolocation data so please follow Captain's advice and block the call on your receiver. In the future when mytest.py is refactored I hope to be able to offer a UI based control to manage the geolocation lookup.

Azureit commented 4 years ago

The code suggestion presumes that the geolocation data is only used for the time zone. That is currently true as the feature has only just been added but will not hold true in the future. I am aware of desires to use geolocation data elements for other uses. I think I heard that it would be helpful for obtaining general location for weather data. There may well be other uses for the data.

I know geolocation will have other uses in the future, but any code requesting geolocation.get("WHATEVER") can first call Tools.Geolocation.RefreshGeolocation() to initialize or refresh the geolocation dict. That makes more sense.

If your code selects the language in the future, please respect people's choice if they have manually set a language different from the one your geolocation data suggests. The same goes for the weather app.

I know your code is not finished, but as of now it is not respecting manually configured settings.

IanSav commented 4 years ago

@Azureit the solution you suggest to use geolocation calls just before use could initiate even more calls to the geolocation server. If you have issues with one call I am certain you will have a lot more issues with many repeated calls.

I am not expecting to use the geolocation code to set the default language. The absolute default locale, by definition is "en_US". That is how Enigma2 is meant to be written. All other languages require translation. That includes "en_GB" which is the general default language offered by most images. OpenATV defaults to "de_DE".

IanSav commented 4 years ago

@jose-cruz78 why do you believe this is out of hand?

These changes were specifically made to enhance the user experience and to allow Enigma2 to make more user friendly initial choices for you. Apparently there are users who do not appear to be happy to configure some aspects of their receivers. This code tries to make better initial guesses for them. Once the initial guess is made the geolocation data is no longer used. (As I have said a number of times now, the current implementation of Enigma2 does not allow sufficient control at the very early stages of Enigma2 startup. When I, or another developer, changes that then users will be able to have much more control on how their receivers start up.)

You and others seem to think that I am leaking private information out of your receiver. That is not the case. All this geolocation call does is collect public information that is already available about your IP address for your use. NOTHING is sent from your receiver other than a completely anonymous request to have publicly available information about your WAN connection returned to you.

I bet the few of you raising concerns continue to use the web, social media and other Internet based activities. Have you directed these sorts of questions and accusations to all those providers? Believe it or not most people on the Internet are performing significantly more intensive geolocation calls on your behalf. They are combining that with other data, often provided by you, to do significantly more than offer a suggested time zone for your receiver to use. (Note that on Enigma2 this geolocation code does an anonymous geolocation call and collects the results itself. The results are not kept or used for any other purpose than to initialise some functions on your receiver.)

Next time you surf the web with your private browsing options turned on, your cookies turned off and the connection set to its most private settings please tell me why the website you are connecting to knows to present the page with your country's flag and your local time. It may even use a language of your country even if your browser's language is different. The reason is that the website you are using is doing a 3rd party geolocation lookup of your connection. Are you complaining to all those websites about invasion of your privacy?

The only reason this thread is getting out of hand is that some users want to believe that I have in some unspecified way taken their private information and leaked it to the Internet. That is simply NOT true. Perhaps I am encouraging the "mess" by being open and communicative and trying to explain exactly what is going on. (Apparently to little avail as people simply don't want to listen or understand me. Everything I am saying is fully verifiable by looking at the Enigma2 open source code, reading the privacy statement of the geolocation provider I am using and researching geolocation on the Internet.)

IanSav commented 4 years ago

"None so deaf as those that will not hear. None so blind as those that will not see." (Matthew Henry)

IanSav commented 4 years ago

@jose-cruz78 no user has yet presented a valid case of data compromise.

You joined GitHub just three hours ago. Why did you do that? You don't seem to be concerned that you provided more information to them to create that account than I used to do the geolocation lookup!

Is your receiver connected to the Internet? Is your receiver getting its EPG from the Internet?

IanSav commented 4 years ago

So have you complained to the providers of all the Internet connections you have made that did 3rd party geolocation lookups on you? Trust me that they did look you up. They didn't tell you, they didn't get your consent. They didn't give you an option to opt out.

IanSav commented 4 years ago

Internet yes, no epg from internet. I told you, I was pointed here, didn't know github before.

You created an account and gave GitHub (owned by Microsoft) far more information than my geolocation code returned to you!

Azureit commented 4 years ago

Let's all calm down please, inflammatory comments lead us nowhere.

@IanSav

@Azureit the solution you suggest to use geolocation calls just before use could initiate even more calls to the geolocation server. If you have issues with one call I am certain you will have a lot more issues with many repeated calls.

Look at the patch in my comment; I even commented it:

def InitGeolocation():
    global geolocation
    # Do lookup if not done already, or if info is older than a day
    if "lastchecked" not in geolocation or time.time() - geolocation["lastchecked"] > 86400:
        try:
            response = urlopen("http://ip-api.com/json/?fields=33288191", data=None, timeout=10).read()
            # print "[Geolocation] DEBUG:", response
(...)

IanSav commented 4 years ago

People need to get a sense of proportion here. Geolocation is a fact of life. The code I used is clear and open. Nothing is hidden. No private information is obtained or provided to the geolocation provider. The information that is returned is completely public and readily available.

If I had not clearly documented the addition of the geolocation tool I suggest that none of this discussion would have occurred. People would never have noticed it. I am being very clear and open about what is going on.

IanSav commented 4 years ago

@Azureit the data can change with every connection. The only way to validate the cache would be to perform another lookup. That defeats any gain by caching the data. In this case the code to do all the work simply becomes bloat.

IanSav commented 4 years ago

@jose-cruz78 I have nothing to defend. I am simply being polite and responding to address your concerns. By the tone and nonsense of your last post I will take it that you wish me to ignore your posts.

Azureit commented 4 years ago

@Azureit the data can change with every connection. The only way to validate the cache would be to perform another lookup. That defeats any gain by caching the data. The opposite is true, the code to do all the work simply becomes bloat.

It's not caching. We do the request to populate the geolocation dictionary and add an extra key, "lastchecked", with the current timestamp. The next time geolocation is needed, we check that key to see whether we already have fresh geolocation data and, if so, skip doing another request.
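
The timestamp check can be tried out without any network access using the Python 3 sketch below. The fetch callable and the injectable clock are assumptions added so the logic can be tested, and refresh_geolocation is a hypothetical name rather than the patch itself:

```python
import time

MAX_AGE = 86400  # one day, the same threshold as in the patch

geolocation = {}


def refresh_geolocation(fetch, now=time.time, max_age=MAX_AGE):
    """Call fetch() only when no data is held or the held data is too old.

    `fetch` is a hypothetical callable returning a dict (for example the
    decoded lookup answer); `now` is injectable so the logic can be tested.
    An empty answer leaves the previously held data untouched.
    """
    global geolocation
    last = geolocation.get("lastchecked")
    if last is None or now() - last > max_age:
        data = fetch()
        if data:
            data = dict(data)
            data["lastchecked"] = now()  # timestamp of the last successful check
            geolocation = data
    return geolocation
```

Any code that needs a field would call refresh_geolocation(...) first and read from the returned dictionary, which results in at most one lookup per day as long as the process keeps running.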

IanSav commented 4 years ago

What you propose is the definition of caching!

IanSav commented 4 years ago

Three people have now negatively commented on the geolocation change. Why can't any of you demonstrate how privacy is being compromised by this code?