benbusby / whoogle-search

A self-hosted, ad-free, privacy-respecting metasearch engine
https://pypi.org/project/whoogle-search/
MIT License

[BUG] results.py:100: SyntaxWarning: invalid escape sequence '\|' #1144

Open glitsj16 opened 4 months ago

glitsj16 commented 4 months ago

Describe the bug

Getting the warning below on Arch Linux with Python 3.12:

Apr 27 18:53:23 lab16 whoogle[4699]: /home/glitsj16/whoogle-search/app/utils/results.py:100: SyntaxWarning: invalid escape sequence '\|'
Apr 27 18:53:23 lab16 whoogle[4699]:   if re.match('.*[@_!#$%^&*()<>?/\|}{~:].*', target_word) or (

Additional context

$ pacman -Q python
python 3.12.3-1

The small patch below fixes the warning for me. Since this is Python 3.12, there may be a better way to deal with it.

UPDATE: This happens because Python 3.12 emits a SyntaxWarning instead of a DeprecationWarning for invalid backslash escape sequences:

Invalid backslash escape sequences in strings now warn with SyntaxWarning instead of DeprecationWarning, making them more visible. (They will become syntax errors in the future.)

Using a raw string for the regex pattern fixes the warning:

$ cat syntaxwarning.patch
--- a/app/utils/results.py
+++ b/app/utils/results.py
@@ -97,7 +97,7 @@
         else:
             reg_pattern = fr'\b((?![{{}}<>-]){target_word}(?![{{}}<>-]))\b'

-        if re.match('.*[@_!#$%^&*()<>?/\|}{~:].*', target_word) or (
+        if re.match(r'.*[@_!#$%^&*()<>?/\|}{~:].*', target_word) or (
                 element.parent and element.parent.name == 'style'):
             return
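
For anyone who wants to check the behaviour outside Whoogle, here is a minimal sketch that compiles the pattern literal both ways and records any warnings. The names `SRC_BEFORE`, `SRC_AFTER`, and `compile_and_collect` are made up for illustration and are not part of the Whoogle code base; only the pattern itself is copied from `results.py`.

```python
import re
import warnings

# Source text of the pattern literal before and after the patch.
# The outer literals are raw so the backslash reaches compile() untouched.
SRC_BEFORE = r"pattern = '.*[@_!#$%^&*()<>?/\|}{~:].*'"
SRC_AFTER = r"pattern = r'.*[@_!#$%^&*()<>?/\|}{~:].*'"


def compile_and_collect(source):
    """Compile and run `source`; return (pattern value, warning categories)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not let default filters hide anything
        code = compile(source, "<snippet>", "exec")
    namespace = {}
    exec(code, namespace)
    return namespace["pattern"], [w.category for w in caught]


before_value, before_warnings = compile_and_collect(SRC_BEFORE)
after_value, after_warnings = compile_and_collect(SRC_AFTER)

# Both spellings produce the same string; only the non-raw literal warns.
print(before_value == after_value)
print([c.__name__ for c in before_warnings])
print([c.__name__ for c in after_warnings])

# The pattern still flags words containing one of the listed characters.
print(bool(re.match(after_value, "foo|bar")))
print(bool(re.match(after_value, "foobar")))
```

On Python 3.12 and later the recorded category should be SyntaxWarning, while Python 3.6 through 3.11 report DeprecationWarning for the same literal. Since `|` has no special meaning inside a character class, dropping the backslash entirely would presumably work as well, but the raw string keeps the pattern byte-for-byte identical.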

HTH

davidthewatson commented 2 weeks ago

Confirmed fix on Garuda Linux with whoogle installed via AUR.

Before patch:

     ╭─watson@acer in /opt/whoogle-search via  v3.12.5 (venv) as 🧙 took 15ms
     ╰─λ ./run 
    Running on http://0.0.0.0:5000
    /opt/whoogle-search/app/utils/results.py:100: SyntaxWarning: invalid escape sequence '\|'
      if re.match('.*[@_!#$%^&*()<>?/\|}{~:].*', target_word) or (
    Exception in thread Thread-1 (gen_bangs_json):
    Traceback (most recent call last):
      File "/usr/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
        self.run()
      File "/usr/lib/python3.12/threading.py", line 1012, in run
        self._target(*self._args, **self._kwargs)
      File "/opt/whoogle-search/app/utils/bangs.py", line 26, in gen_bangs_json
        data = json.loads(r.text)
               ^^^^^^^^^^^^^^^^^^
      File "/usr/lib/python3.12/json/__init__.py", line 346, in loads
        return _default_decoder.decode(s)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/lib/python3.12/json/decoder.py", line 337, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/lib/python3.12/json/decoder.py", line 355, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    ^C⏎                                                                                         

After patch:

     ╭─watson@acer in /opt/whoogle-search via  v3.12.5 (venv) as 🧙 took 1m58s
     ╰─λ ./run
    Running on http://0.0.0.0:5000
    ^C⏎                                 

Version:

     ╭─watson@acer in /opt/whoogle-search via  v3.12.5 (venv) as 🧙 took 231ms
     ╰─λ pamac search whoogle
    whoogle  0.8.4-1.1 [Installed]                                                   chaotic-aur
        A self-hosted, ad-free, privacy-respecting metasearch engine

Thanks for the fix!