ip-tools / patzilla

PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.
https://docs.ip-tools.org/patzilla/
GNU Affero General Public License v3.0

Problems installing in sandbox mode, and migration to Python 3 #68

Open papoteur-mga opened 1 year ago

papoteur-mga commented 1 year ago

I'm trying to launch PatZilla in sandbox mode. However, when I open the page http://localhost:6543/navigator/ I get a blank page. In the browser's debug console, I found:

Uncaught Error: Module build failed (from /home/yves/patzilla/patzilla/node_modules/sass-loader/lib/loader.js):
Error: Missing binding /home/yves/patzilla/patzilla/node_modules/node-sass/vendor/linux-x64-83/binding.node
Node Sass could not find a binding for your current environment: Linux 64-bit with Node.js 14.x

Found bindings for the following environments:
  - Linux 64-bit with Node.js 11.x

This usually happens because your environment has changed since running `npm install`.
Run `npm rebuild node-sass` to download the binding for your current environment.

However, the file /home/yves/patzilla/patzilla/node_modules/node-sass/vendor/linux-x64-83/binding.node exists. On the Internet, I found multiple answers suggesting to rebuild the node-sass module, but after the tenth rebuild, I am convinced that something else is the cause. My system Node.js is at version 14.25.3, so I don't understand why it says "Found bindings for the following environments: - Linux 64-bit with Node.js 11.x".
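
To cross-check what the error message claims, one can list which binding directories under node-sass/vendor actually contain a binding.node file. The following is only a minimal sketch: the function name `installed_bindings` is hypothetical, and the table maps a few Node.js ABI numbers (the numeric suffix of the directory name, such as 83 for Node.js 14 and 67 for Node.js 11) to release lines.

```python
import re
from pathlib import Path

# Node.js ABI numbers (NODE_MODULE_VERSION) for a few release lines.
ABI_TO_NODE = {64: "10.x", 67: "11.x", 72: "12.x", 79: "13.x", 83: "14.x"}


def installed_bindings(vendor_dir):
    """Return (directory name, Node.js release line) for each usable binding.

    A directory only counts when it is named like `linux-x64-83` and
    contains a `binding.node` file.
    """
    found = []
    for entry in sorted(Path(vendor_dir).iterdir()):
        match = re.fullmatch(r"(?P<platform>.+)-(?P<abi>\d+)", entry.name)
        if match and (entry / "binding.node").is_file():
            abi = int(match.group("abi"))
            found.append((entry.name, ABI_TO_NODE.get(abi, "unknown")))
    return found
```

Calling `installed_bindings("node_modules/node-sass/vendor")` from the project directory would show whether the binding for the running Node.js version is present in the tree that the build actually uses.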

amotl commented 1 year ago

Dear Yves,

thank you for writing in. The current version of PatZilla is a bit old, still building upon Python 2 and Node.js 11 ^1. I think you will only be successful with the corresponding software versions in your environment. However, on the Node.js side, it is pretty easy by using Supernode, as suggested.

On the other hand, I think it will also be time to finally provide OCI images for consumption. Let me know if you are not successful, then I can work on this.

With kind regards, Andreas.

papoteur-mga commented 1 year ago

Thanks Andreas for your quick answer. My problem was that I had a mix of Python 2 and Python 3, with two virtualenv directories, .venv and .venv2. Now I get further. But:

http://localhost:6543/static/assets/commons.bundle.js
[HTTP/1.1 404 Not Found 19ms]

I have not yet looked for a solution. My idea is to add the summary export. But the Python3 migration is probably the first step.

amotl commented 1 year ago

Hi again,

feel free to keep this issue open until we've resolved the JavaScript woes.

[I get] 404 Not Found for /static/assets/commons.bundle.js.

Does running yarn build make any difference for you?

The Python3 migration is probably the first step.

That is dearly needed, right. Because PatZilla also lacked a comprehensive test suite, which will tremendously help with the migration, I started to write test cases a few months ago.

Also, there are a few branches which already made progress on the Python migration: There is the python3 branch, which has a few things, but there is also the peds branch, which is already running on Python 3, but only covers a certain amount of functionality I needed for a recent project.

I will try to get back to working on this, and hope to find some time for that in April.

With kind regards, Andreas.

papoteur-mga commented 1 year ago

[I get] 404 Not Found for /static/assets/commons.bundle.js.

Does running yarn build make any difference for you?

Yes, it did!

The Python3 migration is probably the first step.

That is dearly needed, right. Because PatZilla also lacked a comprehensive test suite, which will tremendously help with the migration, I started to write test cases a few months ago.

Also, there are a few branches which already made progress on the Python migration: There is the python3 branch, which has a few things, but there is also the peds branch, which is already running on Python 3, but only covers a certain amount of functionality I needed for a recent project.

Interesting. I will have a look at these branches. I know Python, but I am not really familiar with web technologies.

papoteur-mga commented 1 year ago

I still have this error. I presume that this is linked to the MongoDB version.

Reason: pymongo.errors.OperationFailure: Unsupported OP_QUERY command: delete. The client driver may require an upgrade. For more details see https://dochub.mongodb.org/core/legacy-opcode-removal

I have the latest version, 6.0.5.
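
A small guard can make this kind of mismatch visible before the first query fails. This is a minimal sketch, not part of PatZilla: the function name `is_supported_mongodb` is hypothetical, and the threshold (major version below 6) is taken only from what is reported in this thread. The version string could, for example, come from pymongo's `MongoClient().server_info()["version"]`.

```python
def is_supported_mongodb(version, first_unsupported_major=6):
    """Return True when the server's major version is below the threshold.

    `version` is a dotted version string such as "6.0.5". The default
    threshold of 6 is an assumption based on this issue, where MongoDB 6
    rejects the legacy OP_QUERY commands sent by the driver in use.
    """
    major = int(version.split(".")[0])
    return major < first_unsupported_major
```

With the version reported above, `is_supported_mongodb("6.0.5")` evaluates to `False`.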

amotl commented 1 year ago

Dear Yves,

I can confirm the error you are observing with MongoDB 6. I will update the documentation and sandbox tooling to use a lower version. Thanks a stack for the report!

With kind regards, Andreas.

amotl commented 1 year ago

Hi again,

I've just added cca06a96ee, which will use Docker to run MongoDB 5 when invoking make mongodb. Let me know if this helps, or if you can observe further problems.

With kind regards, Andreas.

amotl commented 1 year ago

Hi Yves,

have you had any luck running PatZilla in sandbox mode with MongoDB 5? Otherwise, please let me know if you observe further errors.

With kind regards, Andreas.

papoteur-mga commented 1 year ago

Hi Andreas, I installed version 4, and it works. I have not yet tried version 5, however. I'm working on the Python3 migration. I have already committed some of the fixes on my fork. I am trying to get the tests to pass. One of them is failing because of USPTO. Is configuration and/or authentication needed?

wget https://ppubs.uspto.gov/dirsearch-public/image-conversion/convert?url=us-pgpub/US/2014/0071/638/00000002.tif
--2023-03-24 18:08:43--  https://ppubs.uspto.gov/dirsearch-public/image-conversion/convert?url=us-pgpub/US/2014/0071/638/00000002.tif
Resolving ppubs.uspto.gov (ppubs.uspto.gov)... 18.213.5.133, 54.152.196.200
Connecting to ppubs.uspto.gov (ppubs.uspto.gov)|18.213.5.133|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden

amotl commented 1 year ago

Hi Yves,

USPTO tests are failing. Is configuration and/or authentication needed?

I also noticed this problem recently. It is caused by WAF blocking, see https://github.com/ip-tools/patzilla/issues/61#issuecomment-1475408168. So, I've just deactivated the USPTO-related tests with 5fa5107556 without further ado.

I'm working on the Python3 migration. I have already committed some of the fixes on my fork. I try to have the tests to pass.

Sweet. Many thanks. Will you already take the other branches into consideration? Maybe I should rush them in first, and then you can rebase your work on top of that incrementally, and take care of the remaining gaps?

With kind regards, Andreas.

amotl commented 1 year ago

Please also see ^1 when working on the Python 3 migration, you will need it. I will see what I can do to wrap this into a package on PyPI, to make integration seamless again.

papoteur-mga commented 1 year ago

For the migration, I started from the latest state of the main branch, then applied the patches from the python3 branch. Thus, the last commits from the peds branch are not used, as I saw that the first ones were already in the main branch. The tests are in quite good shape. I still have a problem with ModuleNotFoundError: No module named '__builtin__' in the mock component when calling tests/util/test_numbers_helper.py:21

I also still have the problem that the tabs Desc, ... are not displayed. This was already the case with the Python2 state. I had a look at the browser's console: there is no JavaScript error, and I can see that the data is fetched, but it is not displayed. I don't know where to look.

amotl commented 1 year ago

Tests are quite in a good shape.

Excellent!

I still have problems with ModuleNotFoundError: No module named '__builtin__' in mock component when calling tests/util/test_numbers_helper.py:21

I see. Don't know where this is coming from, but I may have a look.
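
For reference, Python 3 renamed the `__builtin__` module to `builtins`, so a mock patch target that still names `__builtin__` raises exactly this ModuleNotFoundError. Whether the failing test patches `open` this way is an assumption; the sketch below only illustrates the Python 3 spelling, and the helper `read_first_line`, the path, and the sample data are made up for the example.

```python
import builtins  # Python 3 name of the Python 2 `__builtin__` module
from unittest import mock


def read_first_line(path):
    """Hypothetical helper that reads one line through the built-in open()."""
    with open(path) as handle:
        return handle.readline().strip()


def demo():
    # Replace builtins.open with a mock that serves in-memory data,
    # so no file needs to exist on disk.
    fake_open = mock.mock_open(read_data="EP666666\n")
    with mock.patch.object(builtins, "open", fake_open):
        return read_first_line("/does/not/exist.txt")
```

The equivalent string form of the target is `mock.patch("builtins.open", ...)`.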

I still have a problem that the tabs Desc, ... are not displayed. This is already the case with the Python2 state.

Oh. Yes, that probably has nothing to do with Python2. I will have a look, maybe tomorrow.

amotl commented 1 year ago

FYI: You can unlock a few more test cases by exporting the OPS_API_CONSUMER_KEY and OPS_API_CONSUMER_SECRET environment variables with valid values in your terminal, and then running pytest -k epo. Cheers!
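
A quick way to see whether both variables are visible to the process that will run pytest is sketched below. The helper name `missing_ops_credentials` is hypothetical and not part of PatZilla; only the two variable names come from the comment above.

```python
import os

# The two environment variables named above.
REQUIRED = ("OPS_API_CONSUMER_KEY", "OPS_API_CONSUMER_SECRET")


def missing_ops_credentials(environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED if not environ.get(name)]
```

An empty list from `missing_ops_credentials()` means both values are set in the current environment.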

papoteur-mga commented 1 year ago

I committed the state of my work in my repo. One test is still not passing. In the doctest of 05_misc.rst, failing CQL requests produce a full stack trace, which wasn't expected. Example:

052 >>> CQL('(((foobar)))').dumps()
053 '(((foobar)))'
054 
055 
056 Queries with errors
057 ===================
058 
059 Nonsense
060 --------
061 >>> CQL('foo bar', logging=False).dumps()
Differences (unified diff with -expected +actual):
    @@ -1,3 +1,12 @@
     Traceback (most recent call last):
    -    ...
    -ParseException: Expected end of text, found 'bar'  (at char 4), (line:1, col:5)
    +  File "/usr/lib64/python3.8/doctest.py", line 1336, in __run
    +    exec(compile(example.source, filename, "single",
    +  File "<doctest 05_misc.rst[8]>", line 1, in <module>
    +    CQL('foo bar', logging=False).dumps()
    +  File "/home/yves/patzilla3/patzilla/patzilla/util/cql/pyparsing/__init__.py", line 18, in __init__
    +    self.tokens = self.parse()
    +  File "/home/yves/patzilla3/patzilla/patzilla/util/cql/pyparsing/__init__.py", line 59, in parse
    +    tokens = self.grammar().parser.parseString(self.cql, parseAll=True)
    +  File "/home/yves/patzilla3/patzilla/.venv3/lib/python3.8/site-packages/pyparsing/core.py", line 1141, in parse_string
    +    raise exc.with_traceback(None)
    +pyparsing.exceptions.ParseException: Expected end of text, found 'bar'  (at char 4), (line:1, col:5)

/home/yves/patzilla3/patzilla/patzilla/util/cql/pyparsing/test/05_misc.rst:61: DocTestFailure

I have pyparsing in 3.0.9.
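
One possible explanation, judging from the diff above: the expected output names the exception as `ParseException`, while Python 3 reports the module-qualified `pyparsing.exceptions.ParseException`. The standard doctest option `IGNORE_EXCEPTION_DETAIL` ignores both the module path and the message when comparing exceptions. The sketch below reproduces the effect with stand-ins: the `ParseError` class, its module path `fakepkg.exceptions`, and the `parse` function are hypothetical and only mimic the situation.

```python
import doctest


class ParseError(Exception):
    """Stand-in for an exception defined in a third-party package."""

    __module__ = "fakepkg.exceptions"  # hypothetical module path


def parse(text):
    raise ParseError("Expected end of text, found 'bar'")


DOC = """
>>> parse('foo bar')
Traceback (most recent call last):
    ...
ParseError: Expected end of text, found 'bar'
"""


def failures(optionflags=0):
    """Run the doctest in DOC and return the number of failed examples."""
    test = doctest.DocTestParser().get_doctest(
        DOC, {"parse": parse}, "sketch", None, 0
    )
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Discard the failure report; only the counts matter here.
    results = runner.run(test, out=lambda text: None)
    return results.failed
```

Without flags, the example fails because the reported name carries the module prefix. With `failures(doctest.IGNORE_EXCEPTION_DETAIL)` it passes. The same option can be set per example with the `# doctest: +IGNORE_EXCEPTION_DETAIL` directive.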

Error with builtins is fixed.

I also get warnings:

 .venv3/lib/python3.8/site-packages/mongoengine/base/document.py:449: DeprecationWarning: No 'json_options' are specified! Falling back to LEGACY_JSON_OPTIONS with uuid_representation=PYTHON_LEGACY. For use with other MongoDB drivers specify the UUID representation to use. This will be changed to uuid_representation=UNSPECIFIED in a future release.

I don't know whether the fix belongs in mongoengine or in the code calling it.

amotl commented 1 year ago

I committed the state of my work in my repo.

Very sweet, thank you! Will you submit a pull request? Then, I can look into the remaining errors, and maybe add a few fixes to the patch. Independently, I will look into the JavaScript problem with the document details chooser for navigating to Description and Claims.

amotl commented 1 year ago

Independently, I will look into the JavaScript problem at the document details chooser for navigating to Description and Claims.

I've just created GH-69 to analyze and track this problem. Do my observations reported there actually match yours?

amotl commented 1 year ago

Hi again,

I still have a problem that the tabs Desc, ... are not displayed. This is already the case with the Python2 state. I had a look in the browser's console, there is no error in javascript, and I can see that data are collected, but not displayed. I don't know where to look at.

0933634524 has a corresponding fix, so GH-69 has already been resolved. The problem was very subtle, as it did not produce any error message.

With kind regards, Andreas.