Open div927 opened 2 years ago
Hi @div927, could you please share your Python version and the versions of the relevant packages: extruct, rdflib, rdflib-jsonld, pyrdfa3?
@lopuhin Python 3.6.9, extruct 0.13.0, rdflib 5.0.0, rdflib-jsonld 0.6.2, pyRdfa3 3.5.3
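For anyone hitting the same error, the version report asked for above can be collected with a short script. This is only a sketch: the helper name `installed_versions` is made up for illustration, and it falls back to `pkg_resources` because `importlib.metadata` is not available on Python 3.6.

```python
import sys


def installed_versions(names):
    """Return {package name: version string, or None if not installed}."""
    try:
        # Standard library on Python 3.8+.
        from importlib.metadata import PackageNotFoundError, version

        def lookup(name):
            try:
                return version(name)
            except PackageNotFoundError:
                return None
    except ImportError:
        # Fallback for older interpreters such as 3.6; needs setuptools.
        import pkg_resources

        def lookup(name):
            try:
                return pkg_resources.get_distribution(name).version
            except pkg_resources.DistributionNotFound:
                return None

    return {name: lookup(name) for name in names}


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    packages = ["extruct", "rdflib", "rdflib-jsonld", "pyRdfa3"]
    for name, ver in installed_versions(packages).items():
        print(name, ver or "not installed")
```

Pasting the printed lines into the issue gives the same information as the list above.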
@div927 I see, there were some incompatible changes in the latest rdflib versions. If I go with the latest versions of everything, it works for me with Python 3.9, so in your case I hope updating rdflib to 6.0.2 and rdflib-jsonld to 0.6.2 should fix the issue.
Probably we should check which versions don't work and add constraints.
@lopuhin I don't think rdflib 6.0.2 and rdflib-jsonld 0.6.2 can be installed on Python 3.6.9; when I tried, it didn't work for me:
Collecting rdflib==6.0.2
  Could not find a version that satisfies the requirement rdflib==6.0.2 (from versions: 2.4.1, 2.4.2, 3.0.0, 3.1.0, 3.2.0, 3.2.1, 3.2.2, 3.2.3, 3.4.0, 4.0, 4.0.1, 4.1.0, 4.1.1, 4.1.2, 4.2.0, 4.2.1, 4.2.2, 5.0.0rc1, 5.0.0)
No matching distribution found for rdflib==6.0.2
@div927 oh sorry, my bad - I misread and was checking with Python 3.9. Actually we have the same problem with the build here https://github.com/scrapinghub/extruct/runs/3745270289?check_suite_focus=true - let me check if there is some working configuration. Unfortunately, old build logs are no longer available. Worst case, downgrading extruct should work, and extraction quality and API should be pretty similar.
@lopuhin which version of extruct is compatible with Python 3.6.9, in case I have to downgrade it?
@div927 aha here is the issue: https://pypi.org/project/rdflib-jsonld/ says that
If you are forced to keep using Python <= 3.6, you will need to keep using release <= 0.5.0 of this plugin with RDFlib 5.0.0.
So if you downgrade rdflib-jsonld to 0.5.0 then it works - I checked with Python 3.6.
Actually, https://github.com/scrapinghub/extruct/pull/182 already puts those constraints in place, so let us try to finish it (there was another build issue there).
@lopuhin yes !
data = extruct.extract(r.text, base_url=base_url)
/Users/divyanshu/flask/lib/python3.6/site-packages/rdflib_jsonld/__init__.py:12: DeprecationWarning: The rdflib-jsonld package has been integrated into rdflib as of rdflib==6.0.1. Please remove rdflib-jsonld from your project's dependencies.
  DeprecationWarning,
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/divyanshu/flask/lib/python3.6/site-packages/extruct/_extruct.py", line 108, in extract
    output[syntax] = list(extract(document, base_url=base_url))
  File "/Users/divyanshu/flask/lib/python3.6/site-packages/extruct/rdfa.py", line 154, in extract_items
    jsonld_string = g.serialize(format='json-ld', auto_compact=not expanded)
  File "/Users/divyanshu/flask/lib/python3.6/site-packages/rdflib/graph.py", line 961, in serialize
    serializer = plugin.get(format, Serializer)(self)
  File "/Users/divyanshu/flask/lib/python3.6/site-packages/rdflib/plugin.py", line 107, in get
    return p.getClass()
  File "/Users/divyanshu/flask/lib/python3.6/site-packages/rdflib/plugin.py", line 84, in getClass
    self._class = self.ep.load()
  File "/Users/divyanshu/flask/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2322, in load
    return self.resolve()
  File "/Users/divyanshu/flask/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2328, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
ModuleNotFoundError: No module named 'rdflib_jsonld.serializer'