Exodus-Privacy / exodus-core

Core functionality of εxodus
GNU Affero General Public License v3.0

Tests fail because https://matlink.fr cannot be accessed #8

Closed FrancoisDupayrat closed 3 years ago

FrancoisDupayrat commented 6 years ago

Hello,

I tried running the tests, but one of them fails because https://matlink.fr/token/email/gsfid, which is hardcoded at line 269 of exodus_core/analysis/static_analysis.py, cannot be accessed.

Is there another working URL that could be used instead? Please note it is also the one used in the default gplaycli.conf, meaning the project won't work out of the box.

I tried setting the following, based on this example (https://github.com/matlink/gplaycli/blob/master/example_credentials.conf):

        gpc.token_enable = False
        gpc.gmail_address = 'gmail_address'
        gpc.gmail_password = 'password'

But then I get ERROR:root:'GPlaycli' object has no attribute 'token_url'

Here are the full test logs with unmodified exodus-core source code:

python3 -m unittest discover -s exodus_core -p "test_*.py"
invalid decoded string length
invalid decoded string length
invalid decoded string length
..WARNING:root:Unable to get the icon from the APK - downloading from details
ERROR:gplaycli.gplaycli:cache file does not exists or is corrupted
ERROR:root:HTTPSConnectionPool(host='matlink.fr', port=443): Max retries exceeded with url: /token/email/gsfid (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x113606630>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
WARNING:root:Unable to get the icon from details - downloading from GPlay
ERROR:root:Unable to download the icon from Google Play
ERROR:root:Unable to download the icon
ERROR:root:Unable to save the icon
F...
======================================================================
FAIL: test_icon_diff (analysis.test_exodus_analyze.TestExodus)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/FrancoisDupayrat/Documents/exodus-core/exodus_core/analysis/test_exodus_analyze.py", line 52, in test_icon_diff
    self.assertEqual(phash_4, 325352301465779383961442563121869825536)
AssertionError: '' != 325352301465779383961442563121869825536

----------------------------------------------------------------------
Ran 6 tests in 71.067s

FAILED (failures=1)
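
The "[Errno 8] nodename nor servname provided, or not known" part of the log points at a name-resolution failure rather than an HTTP error. A minimal sketch for checking that on its own, outside the test suite, using only the standard library (the helper name host_resolves is hypothetical and is not part of exodus-core or gplaycli):

```python
import socket


def host_resolves(hostname: str, port: int = 443) -> bool:
    """Return True if the hostname resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, port)) > 0
    except socket.gaierror:
        # Same failure class as the "[Errno 8] nodename nor servname
        # provided, or not known" message in the log above.
        return False
```

Calling host_resolves("matlink.fr") before running the tests would tell whether the token URL has any chance of being reached from the current machine.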

U039b commented 5 years ago

Thank you. I will investigate.

counter-reverse commented 5 years ago

The hardcoded URL you found in the code is the default one used in the GitHub tutorial for setting up the development environment: https://github.com/Exodus-Privacy/exodus-core. I use it and I do not get the following errors:

ERROR:gplaycli.gplaycli:cache file does not exists or is corrupted
ERROR:root:HTTPSConnectionPool(host='matlink.fr', port=443): Max retries exceeded with url: /token/email/gsfid (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x113606630>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
WARNING:root:Unable to get the icon from details - downloading from GPlay.

My configuration file contains:

token=True#very important
token_url=https://matlink.fr/token/email/gsfid
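
As a sketch of how such settings could be read and sanity-checked, the snippet below parses them with Python's standard configparser. It assumes the file is INI-style with a [Credentials] section, which is not shown in this thread, and both helper names are hypothetical. Note that configparser does not strip inline comments by default, so in this sketch the comment sits on its own line:

```python
import configparser

# Assumption: INI-style file with a [Credentials] section. Adjust the
# section name to match the real gplaycli.conf.
CONF_TEXT = """
[Credentials]
# very important
token=True
token_url=https://matlink.fr/token/email/gsfid
"""


def load_token_settings(text: str, section: str = "Credentials"):
    """Return (token_enabled, token_url) parsed from INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.getboolean(section, "token"), parser.get(section, "token_url")


def raw_token_value(text: str, section: str = "Credentials") -> str:
    """Return the raw string stored under 'token', without conversion."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.get(section, "token")
```

With the comment glued to the value, as in token=True#very important, raw_token_value returns the whole string including the comment, so a boolean conversion of it would fail.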

I think token=True may fix it.

Also, in my opinion, the problem does not come only from the URL you gave and your configuration file.

Consider the "downloading from details" and "downloading from GPlay" messages as two very different errors. There may be two problems: the configuration file you mentioned, and the image hashing (the second problem that shows up in your logs). I will explain why. For now, let's focus on the second problem.

First step: read the traceback you showed:

"File "/Users/<anonymized>/Documents/exodus-core/exodus_core/analysis/test_exodus_analyze.py", line 52, in test_icon_diff
    self.assertEqual(phash_4, 325352301465779383961442563121869825536)"

It gives the exact file and line where the error happened. Let's check it out.

    def test_icon_diff(self):
        phash_4 = phash('./apks/nextcloud.apk')
        self.assertEqual(phash_4, 325352301465779383961442563121869825536)
        phash_5 = phash('./apks/francetv.apk')
        self.assertEqual(phash_5, 277543533468213633177527091973989793792)
        phash_1 = phash('./apks/braiar.apk')
        phash_2 = phash('./apks/whatsapp.apk')
        phash_3 = phash('./apks/hsbc.apk')
        sa = StaticAnalysis()
        diff_1 = sa.get_icon_similarity(phash_1, phash_2)
        diff_2 = sa.get_icon_similarity(phash_1, phash_1)
        diff_3 = sa.get_icon_similarity(phash_2, phash_1)
        diff_4 = sa.get_icon_similarity(phash_2, phash_2)
        diff_5 = sa.get_icon_similarity(phash_1, phash_3)
        diff_6 = sa.get_icon_similarity(phash_2, phash_3)
        self.assertEqual(diff_1, 0.7265625)
        self.assertEqual(diff_1, diff_3)
        self.assertEqual(diff_2, 1.0)
        self.assertEqual(diff_2, diff_4)
        self.assertNotEqual(diff_5, diff_6)
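
The value 0.7265625 asserted in this test equals 93/128, which is consistent with a similarity defined as one minus the fraction of differing bits between two 128-bit hashes. A minimal sketch under that assumption (this is not the actual get_icon_similarity code, which is not shown in this thread; the name icon_similarity and the 128-bit width are assumptions):

```python
HASH_BITS = 128  # assumption: 64 row bits + 64 column bits


def icon_similarity(hash_a: int, hash_b: int, bits: int = HASH_BITS) -> float:
    """Return 1.0 for identical hashes, lower values as more bits differ."""
    # XOR leaves a 1 wherever the two hashes disagree.
    differing = bin(hash_a ^ hash_b).count("1")
    return 1.0 - differing / bits
```

Such a measure is symmetric and gives 1.0 when a hash is compared with itself, which matches the assertions on diff_1 to diff_4.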

Well. Let's focus on the first lines.

        phash_4 = phash('./apks/nextcloud.apk')
        self.assertEqual(phash_4, 325352301465779383961442563121869825536)

The error comes from the check that the icon obtained is the expected one. The test computes a perceptual hash of the APK's icon and compares it with a hardcoded hash. If the two differ, the assertion fails, which is exactly what happened here. But what does the function phash() do, and how can we fix it? Let's investigate.

Line 9 of the file test_exodus_analyze.py:

def phash(apk):
    sa = StaticAnalysis(apk)
    return sa.get_icon_phash()

StaticAnalysis is the class defined in the file exodus_core/analysis/static_analysis.py.

Take a look at line 369 of exodus_core/analysis/static_analysis.py:

    def get_icon_phash(self):
        """
        Get the perceptual hash of the icon
        :return: the perceptual hash, None in case of error
        """
        import dhash
        from PIL import Image
        dhash.force_pil()  # Force PIL
        with NamedTemporaryFile() as ic:
            path = self.save_icon(ic.name)
            if path is None:
                logging.error('Unable to save the icon')
                return ''
            try:
                image = Image.open(ic.name).convert("RGBA")
                row, col = dhash.dhash_row_col(image, size = PHASH_SIZE)
                return row << (PHASH_SIZE * PHASH_SIZE) | col
            except IOError as e:
                logging.error(e)
                return ''
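
To illustrate what the row and col values and the final return expression amount to, here is a simplified pure-Python sketch of a difference hash. It is not the dhash library's implementation: it works on a plain grid of grayscale integers instead of a PIL image, the function names are hypothetical, and PHASH_SIZE = 8 is an assumption since its value is not shown in this thread.

```python
PHASH_SIZE = 8  # assumption: the real value is not shown in this thread


def dhash_row_col_sketch(grays, size=PHASH_SIZE):
    """Compute row/column difference hashes of a (size+1) x (size+1) grid."""
    row_hash = 0
    col_hash = 0
    for y in range(size):
        for x in range(size):
            # Row bit: is this pixel darker than its right neighbour?
            row_bit = grays[y][x] < grays[y][x + 1]
            # Column bit: is this pixel darker than the one below it?
            col_bit = grays[y][x] < grays[y + 1][x]
            row_hash = row_hash << 1 | row_bit
            col_hash = col_hash << 1 | col_bit
    return row_hash, col_hash


def combine(row, col, size=PHASH_SIZE):
    """Pack both hashes into one integer, row bits above column bits."""
    return row << (size * size) | col
```

With size 8 each hash has 64 bits, so the combined value fits in 128 bits, which matches the magnitude of the hardcoded values in the test.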

From the code, or simply from its docstring, we can now see roughly what the program does:

  1. The icon is saved to a temporary file on the user's disk.

  2. If the returned path is None, log an error and RETURN AN EMPTY STRING.

  3. Otherwise, try to:

  4. open the image,

  5. compute its row and column difference hashes,

  6. return both hashes combined into a single integer.

  7. If an IOError occurs, log it and RETURN AN EMPTY STRING, so the caller never sees the exception.

  8. NO MATTER WHAT HAPPENED, the test compares the returned value with the hardcoded value "325352301465779383961442563121869825536".

Now we can easily understand that the value 320077571348578638765208875423858963980 from the error AssertionError: 320077571348578638765208875423858963980 != 325352301465779383961442563121869825536 is very probably the checksum of the empty string "". To fix this error, we have to:

  1. Review the function StaticAnalysis.save_icon(ic.name), which might host the bug by ignoring the file and failing every time.

  2. Log a more explicit message than logging.error('Unable to save the icon') and logging.error(e), such as "file not found".

  3. Stop ignoring the failure: exit the program instead, to avoid a later crash and to help the user understand where the error comes from.
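
One way to make such a failure explicit, sketched under assumptions: raise a dedicated exception instead of returning an empty string, so that the test reports the real cause rather than a confusing hash mismatch. The names IconError and icon_phash_or_raise are hypothetical and do not exist in exodus-core; compute_hash stands in for whatever hashing function is used.

```python
import os


class IconError(Exception):
    """Hypothetical exception raised when an icon cannot be obtained."""


def icon_phash_or_raise(icon_path, compute_hash):
    """Hash the icon at icon_path, or raise IconError with a clear message."""
    if icon_path is None or not os.path.isfile(icon_path):
        # Fail loudly here instead of returning '' and letting a later
        # comparison against an integer hash fail for an unrelated reason.
        raise IconError(f"icon file not found: {icon_path!r}")
    return compute_hash(icon_path)
```

A test calling this helper would then stop with "icon file not found" instead of an AssertionError comparing two unrelated values.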

To help the developers fix this error as soon as possible, here is some information about what happens before the failure:

  1. The APK path is passed to the StaticAnalysis constructor (in test_exodus_analyze.py). The constructor (static_analysis.py) immediately loads it with the .load_apk() method of StaticAnalysis, which returns an APK instance (from the file androguard/core/bytecodes/apk.py).

  2. This instance is then used by the method get_icon_phash(self) (static_analysis.py, line 369).

  3. Then the error happens.

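static_analysis.py line 380: the line logging.error('Unable to save the icon') is what produces ERROR:root:Unable to save the icon in the log, and logging.error(e) would log the IOError in the same way. The earlier WARNING:root:Unable to get the icon from the APK - downloading from details line has a different level, so it probably comes from a warning call made earlier, while the icon is being saved. See the documentation at https://docs.python.org/3/library/logging.html#logging.error :

 logging.error(msg, *args, **kwargs)

    Logs a message with level ERROR on the root logger.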
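
To see which logging call produces which line of the log, a small sketch that captures root-logger output using the default format string (logging.BASIC_FORMAT). The helper names are hypothetical; only the two messages are taken from the log above.

```python
import io
import logging


def capture_root_log(emit):
    """Run emit() and return the lines it logged on the root logger."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    # BASIC_FORMAT is "%(levelname)s:%(name)s:%(message)s", the layout
    # seen in the test output (for example "ERROR:root:...").
    handler.setFormatter(logging.Formatter(logging.BASIC_FORMAT))
    root = logging.getLogger()
    root.addHandler(handler)
    try:
        emit()
    finally:
        root.removeHandler(handler)
    return stream.getvalue().splitlines()


def emit_examples():
    logging.warning("Unable to get the icon from the APK - downloading from details")
    logging.error("Unable to save the icon")
```

The level prefix of each captured line comes from the function that was called, so a WARNING line cannot originate from a logging.error call.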

Last but not least, regarding the second problem: I ran into the same failure with the same configuration you showed, so your system configuration seems correct. The second problem comes from the exodus-core code, not from the administration of your system. It is our job to correct and update that. I will investigate and fix it as soon as possible.

counter-reverse commented 5 years ago

Hello.

I solved the issue.

If you want to know more, feel free to read on.

Digging even deeper, we can see that the icon can be retrieved in two ways: (1) through a Google Play client, or (2) through an HTML parser.

The problem is that the HTML parser gets the image URL with the size used on the site, instead of the plain URL that the client returns. When we use it manually, we can see that the URL is a bit different.

For example, the image URL for the Nextcloud application obtained by the parser, https://lh3.googleusercontent.com/7QOkrX_ydQddRFOR1KFRU3zQMKytCff8y-d64RBZKhY2ymJEwMektR5QFcaLrxj4jQ=s180, ends with "=s180", whereas the URL from the client is only https://lh3.googleusercontent.com/7QOkrX_ydQddRFOR1KFRU3zQMKytCff8y-d64RBZKhY2ymJEwMektR5QFcaLrxj4jQ.

But do not worry: these last characters are optional and only determine the size of the image.

So I reimplemented the parser to ignore the equals sign and everything after it.
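
A minimal sketch of that normalization, assuming the size suffix is always introduced by the first "=" in the URL (the function name strip_size_suffix is hypothetical and this is not the actual patch):

```python
def strip_size_suffix(icon_url: str) -> str:
    """Drop the first '=' and everything after it, e.g. a trailing '=s180'."""
    base, _sep, _suffix = icon_url.partition("=")
    return base
```

URLs without any "=" are returned unchanged, so the same function can be applied to icons coming from either source.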

Images collected from the different sources now have the same perceptual hash.

Now I just have to push to the repository.

Just note that using the hash of the larger image (320077571348578638765208875423858963980) is more relevant than the hash of the smaller one (325352301465779383961442563121869825536).

Thank you. Best regards.

pnu-s commented 5 years ago

@FrancoisDupayrat The exodus-core module as it is won't work if the token served by matlink.fr is not available. We could/should remove the hardcoded URL (as I already discussed with @Gu1nness), but that wouldn't solve the problem of local execution, because the URL is also used by the gplaycli module (stored in its local configuration file).

pnu-s commented 3 years ago

Closing this as gplaycli is not used anymore by exodus-core