mrmap-community / GeoPortal.rlp

Error found while loading News (Meldungen), Informationen->Über Uns and others #40

Open dpakprajul opened 10 months ago

dpakprajul commented 10 months ago

Description:

I encountered an error while trying to load the "Meldungen" page. The page loads but the following error is present in the apache error.log file:

[Mon Nov 20 15:51:21.230547 2023] [wsgi:error] [pid 1570133] Exception in thread Thread-53:
[Mon Nov 20 15:51:21.230786 2023] [wsgi:error] [pid 1570133] Traceback (most recent call last):
[Mon Nov 20 15:51:21.230884 2023] [wsgi:error] [pid 1570133] File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner
[Mon Nov 20 15:51:21.231176 2023] [wsgi:error] [pid 1570133] self.run()
[Mon Nov 20 15:51:21.231300 2023] [wsgi:error] [pid 1570133] File "/usr/lib/python3.9/threading.py", line 892, in run
[Mon Nov 20 15:51:21.231532 2023] [wsgi:error] [pid 1570133] self._target(*self._args, **self._kwargs)
[Mon Nov 20 15:51:21.231603 2023] [wsgi:error] [pid 1570133] File "/data/GeoPortal.rlp/useroperations/utils/useroperations_helper.py", line 39, in __set_tag
[Mon Nov 20 15:51:21.231744 2023] [wsgi:error] [pid 1570133] if searcher.is_article_internal(title):
[Mon Nov 20 15:51:21.231849 2023] [wsgi:error] [pid 1570133] File "/data/GeoPortal.rlp/searchCatalogue/utils/searcher.py", line 401, in is_article_internal
[Mon Nov 20 15:51:21.232061 2023] [wsgi:error] [pid 1570133] resp = self.get_info_result_category(tmp)
[Mon Nov 20 15:51:21.232155 2023] [wsgi:error] [pid 1570133] File "/data/GeoPortal.rlp/searchCatalogue/utils/searcher.py", line 386, in get_info_result_category
[Mon Nov 20 15:51:21.232321 2023] [wsgi:error] [pid 1570133] response = response["query"]["pages"]
[Mon Nov 20 15:51:21.232380 2023] [wsgi:error] [pid 1570133] KeyError: 'query'
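
The KeyError at the end of the traceback is raised by the chained lookup response["query"]["pages"]. As a minimal sketch of one way to make such a lookup tolerate a response without a "query" key, assuming the response is an already parsed JSON dict: extract_pages is a hypothetical helper, not a function in the repository, and the sample responses are made up for illustration.

```python
def extract_pages(response):
    """Return the "pages" mapping of a parsed API response, or {} if absent.

    Hypothetical helper for illustration; it is not part of GeoPortal.rlp.
    """
    if not isinstance(response, dict):
        return {}
    query = response.get("query")
    if not isinstance(query, dict):
        return {}
    pages = query.get("pages")
    return pages if isinstance(pages, dict) else {}


# Made-up sample responses: one without "query", one with a single page.
empty = extract_pages({"batchcomplete": ""})
found = extract_pages({"query": {"pages": {"12": {"title": "Meldungen"}}}})
```

A caller could then treat an empty result as "no matching article" instead of letting the thread die with an exception. Whether that is the right behaviour depends on what is_article_internal is supposed to decide.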

Steps to reproduce:

  1. Navigate to the "Meldungen" page, for example: https://www.geoportal.hessen.de/article/Meldungen/
  2. Alternatively, open Informationen and go to one of its subcategories, such as Über uns, GDI-Hessen-
  3. Check the Apache error log file.

Problem:

The code may not be doing what it was intended to do. It should be fixed so that it runs without errors, even though no error is visible in the frontend.

Solving approach:

In Geoportal.rlp/useroperations/utils/useroperations_helper.py:37, the title is always an empty string when the Meldungen page is loaded, although the expected title is Meldungen. The empty title is therefore passed to the searcher in line 39:

if searcher.is_article_internal(title):
                attrib = "/article/" + title

The title is ultimately passed to resp = self.get_info_result_category(tmp) in searchCatalogue/utils/searcher.py, which produces a bad response that does not include ["query"]["pages"]. I replaced lines 37 to 42 in useroperations/utils/useroperations_helper.py with:

# Needed at the top of the module:
from urllib.parse import parse_qs, urlparse

for elem in _list:
    attrib = elem.get(attribute)
    if tag == 'a':
        # Parse the href attribute as a URL
        url = urlparse(elem.get('href', ''))

        # Extract the query parameters from the URL
        query_params = parse_qs(url.query)

        # Get the "title" query parameter, with spaces replaced by underscores
        title = query_params.get('title', [''])[0].replace(' ', '_')

        # Only ask the searcher when a title was actually found
        if title and searcher.is_article_internal(title):
            attrib = "/article/" + title
    if protocol not in attrib:
        elem.set(attribute, prefix + attrib)

This removes the error from the error log file, but I doubt whether the new lines of code still serve the original purpose.
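
To see how the title extraction in the replacement code behaves for different href values, the parsing step can be tried in isolation. This is only a sketch: title_from_href is a hypothetical helper, and the sample hrefs are made up for illustration rather than taken from the portal.

```python
from urllib.parse import parse_qs, urlparse


def title_from_href(href):
    """Return the "title" query parameter of href with spaces as underscores.

    Returns an empty string if href is missing or has no "title" parameter.
    Hypothetical helper mirroring the parsing step of the snippet above.
    """
    query_params = parse_qs(urlparse(href or "").query)
    return query_params.get("title", [""])[0].replace(" ", "_")


# Made-up sample hrefs for illustration.
with_title = title_from_href("/mediawiki/index.php?title=%C3%9Cber%20uns")
anchor_only = title_from_href("#Section_1")
no_href = title_from_href(None)
```

Of these inputs, only the first yields a non-empty title, so the `if title and ...` guard would skip the other two without calling the searcher.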

@holsandre

Question:

What is the purpose of the code in searchCatalogue/utils/searcher.py:390-405 and in get_info_result_category(tmp)? Is it serving its purpose?

holsandre commented 9 months ago

What is the purpose of the code in searchCatalogue/utils/searcher.py:390-405 and in get_info_result_category(tmp)? Is it serving its purpose?

I can't tell with 100% certainty because the code is from a former colleague, but it has to do with MediaWiki articles in the category "Portalseite" being rendered internally in the info search, and maybe in other places as well:

[Image]

Not sure if this has ever worked as expected.

I used part of your code in a commit, which should be enough to get rid of the error: https://github.com/mrmap-community/GeoPortal.rlp/commit/9926d88e8443e20ee2050a2da4ac32e061f22a9f

karlbrink commented 8 months ago

After this fix, the links in the "table of contents" of every MediaWiki article are no longer functional.