thp / urlwatch

Watch (parts of) webpages and get notified when something changes via e-mail, on your phone or via other means. Highly configurable.
https://thp.io/2008/urlwatch/

ImportError: cannot import name 'html2text' from partially initialized module 'urlwatch.html2txt' (most likely due to a circular import) (/usr/lib/python3/dist-packages/urlwatch/html2txt.py) #601

Closed jpiszcz closed 3 years ago

jpiszcz commented 3 years ago

After a recent apt-get dist-upgrade on Debian x86_64, Python was updated from v3.8 to v3.9, and urlwatch now fails with:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urlwatch/handler.py", line 113, in process
    data = FilterBase.process(filter_kind, subfilter, self, data)
  File "/usr/lib/python3/dist-packages/urlwatch/filters.py", line 146, in process
    return filtercls(state.job, state).filter(data, subfilter)
  File "/usr/lib/python3/dist-packages/urlwatch/filters.py", line 291, in filter
    from .html2txt import html2text
ImportError: cannot import name 'html2text' from partially initialized module 'urlwatch.html2txt' (most likely due to a circular import) (/usr/lib/python3/dist-packages/urlwatch/html2txt.py)

$ python3 -m pip install html2text
Requirement already satisfied: html2text in /usr/lib/python3/dist-packages (2020.1.16)
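For readers unfamiliar with the message: the wording "partially initialized module" is what Python 3.8+ prints when a name is requested from a module that is already in sys.modules but has not finished executing. The sketch below reproduces that message deterministically with two throwaway modules that import each other. The module names demo_a and demo_b and the function reproduce_partial_import_error are hypothetical, and this is not a reproduction of the intermittent urlwatch failure, whose cause is not established in this thread.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path


def reproduce_partial_import_error() -> str:
    """Create two modules that import each other and return the ImportError text."""
    with tempfile.TemporaryDirectory() as tmp:
        # demo_a (hypothetical) starts importing demo_b before defining `helper`.
        Path(tmp, "demo_a.py").write_text(textwrap.dedent("""
            from demo_b import other
            def helper():
                return "a"
        """))
        # demo_b (hypothetical) asks for demo_a.helper while demo_a is half-executed.
        Path(tmp, "demo_b.py").write_text(textwrap.dedent("""
            from demo_a import helper
            def other():
                return "b"
        """))
        sys.path.insert(0, tmp)
        try:
            importlib.import_module("demo_a")
        except ImportError as exc:
            return str(exc)
        finally:
            # Clean up so the demo leaves the interpreter state untouched.
            sys.path.remove(tmp)
            sys.modules.pop("demo_a", None)
            sys.modules.pop("demo_b", None)
    return ""


if __name__ == "__main__":
    print(reproduce_partial_import_error())
```

The printed text has the same shape as the error in the traceback above, with demo_a in place of urlwatch.html2txt.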

thp commented 3 years ago

I cannot reproduce this issue with Python 3.9.0, html2text 2020.1.16 and urlwatch 2.21:

% python3.9 -m pip list | egrep 'html2text|urlwatch'
html2text  2020.1.16
urlwatch   2.21

I used the following urls.yaml for testing:

name: test
url: https://thp.io/
filter:
  - html2text:
    method: pyhtml2text

And then testing the filter using:

python3.9 urlwatch --test-filter 1

Which version of urlwatch do you have installed? Can you diff your local file /usr/lib/python3/dist-packages/urlwatch/html2txt.py to the file in the Git repository to see if there are any differences there?

jpiszcz commented 3 years ago

Hello,

No differences reported:

urlwatch-master$ diff -u lib/urlwatch/html2txt.py /usr/lib/python3/dist-packages/urlwatch/html2txt.py
urlwatch-master$

$ dpkg -S /usr/lib/python3/dist-packages/urlwatch/html2txt.py
urlwatch: /usr/lib/python3/dist-packages/urlwatch/html2txt.py

$ dpkg -l urlwatch
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Architecture Description
+++-==============-============-============-=================================
ii  urlwatch       2.21-1       all          monitors webpages for you

Here is the recipe/definition that is having the issue (it only started after Python was upgraded from 3.8 to 3.9):

---
name: "IAD Non-Stop Destinations"
url: "http://www.flydulles.com/iad/nonstop-destinations"
filter:
 - html2text
 - grep: 'iad_intl_*'
---
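
A side note on the second filter in the job above: a grep-style filter treats its argument as a regular expression, not a shell glob, so 'iad_intl_*' means "iad_intl followed by zero or more underscores". The sketch below illustrates that reading with Python's re module. The function grep_lines and the sample lines are hypothetical, and this is not urlwatch's own implementation.

```python
import re


def grep_lines(text: str, pattern: str) -> str:
    """Keep only the lines of `text` in which the regular expression `pattern` is found."""
    regex = re.compile(pattern)
    return "\n".join(line for line in text.splitlines() if regex.search(line))


# Hypothetical input, standing in for the text produced by the html2text step.
sample = "iad_intl_departures\niad_dom_departures\niad_intl\nheader"

if __name__ == "__main__":
    print(grep_lines(sample, "iad_intl_*"))
```

With this sample, both iad_intl_departures and iad_intl are kept, since the trailing underscore is optional under regular-expression rules.
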
scottmac commented 3 years ago

I have some backtraces from when I tracked this down in #559 with a previous version of Python.

thp commented 3 years ago

@jpiszcz If it's the same issue as @scottmac mentioned, can you please check if the latest commit in the Git master branch fixes this?

jpiszcz commented 3 years ago

Yup, this error only happens once in a while. I've updated my crontab to point to urlwatch from Git master and will see if the issue recurs.

polyzen commented 3 years ago

Getting this on Arch Linux ARM now that the Python 3.9 rebuild has made it over. Applied the patch to 2.21, and will report back if it happens again.

jpiszcz commented 3 years ago

As of December 14, the error has yet to recur using the latest commit in Git master.

thp commented 3 years ago

Since it has been some days since December 14, I'm going to close this for now. Please feel free to reopen/comment if this happens again.