the-qa-company / WikibaseSync

Library to copy entities from one Wikibase to another and to keep them in sync
MIT License

An example for importing properties with constraints #4

Open shigapov opened 2 years ago

shigapov commented 2 years ago

Hi Dennis, I managed to run import_one.py and import_list.py and to import entities with multilingual labels and external IDs. Now I would like to add the constraints to properties. How can I do that?

D063520 commented 2 years ago

Hi! So the properties are there, right? Or do you mean getting the plugin to work?

shigapov commented 2 years ago

The properties contain only labels and "Wikidata PID". I thought that the local equivalent of P2302 is also created for all imported properties, no?

The extension WikibaseQualityConstraints is installed.

D063520 commented 2 years ago

When you import an entity, all related properties and statements are imported. The outgoing entities and properties are only imported as leaves. If you want all the properties of an outgoing object or of a property, you need to run import_one.py again for it (otherwise we would recursively import all of Wikidata ;)). BTW, can we have a chat sometime?
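
To illustrate the leaf behaviour described above, here is a toy sketch. It is not WikibaseSync code: the `WIKIDATA` dictionary, its contents, and the `import_entity` function are made up for illustration. The requested entity is copied with its statements, while everything it points to is only created as a stub, so a property gets its own statements only once it is imported directly.

```python
# Toy model, not WikibaseSync code: each entity has a label and statements
# given as (property, target) pairs. All data here is invented.
WIKIDATA = {
    "Q1": {"label": "example item", "claims": [("P31", "Q2")]},
    "Q2": {"label": "example class", "claims": [("P279", "Q3")]},
    "Q3": {"label": "broader class", "claims": []},
    "Q4": {"label": "some constraint", "claims": []},
    "P31": {"label": "instance of", "claims": [("P2302", "Q4")]},
    "P279": {"label": "subclass of", "claims": []},
    "P2302": {"label": "property constraint", "claims": []},
}


def import_entity(entity_id, source, target):
    """Copy one entity in full; create the entities it references as leaves."""
    entity = source[entity_id]
    target[entity_id] = {"label": entity["label"], "claims": list(entity["claims"])}
    for prop, value in entity["claims"]:
        for referenced in (prop, value):
            # setdefault keeps an entity that was already imported in full.
            leaf = {"label": source[referenced]["label"], "claims": []}
            target.setdefault(referenced, leaf)
    return target


local = import_entity("Q1", WIKIDATA, {})
print(local["P31"]["claims"])  # P31 is only a leaf at this point

import_entity("P31", WIKIDATA, local)
print(local["P31"]["claims"])  # its statements arrive once P31 itself is imported
```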

shigapov commented 2 years ago

Hm, I started with a clean Wikibase instance, and wanted to create just one property like this:

python3 import_one.py P31
WARNING: /.local/lib/python3.10/site-packages/rdflib_jsonld/__init__.py:9: DeprecationWarning: The rdflib-jsonld package has been integrated into rdflib as of rdflib==6.0.1.  Please remove rdflib-jsonld from your project's dependencies.
  warnings.warn(
Traceback (most recent call last):
  File "/WikibaseSync/import_one.py", line 31, in <module>
    wikibase_importer.change_property(wikidata_property, wikibase_repo, True)
  File "/WikibaseSync/util/util.py", line 1110, in change_property
    self.changeClaims(wikidata_item, wikibase_item)
  File "/WikibaseSync/util/util.py", line 865, in changeClaims
    for wikibase_claims in wikibase_item.claims:
AttributeError: 'NoneType' object has no attribute 'claims'
CRITICAL: Exiting due to uncaught exception <class 'AttributeError'>

My use case is just to create all properties with constraints. In principle, for that I need to create a list with all Wikidata PIDs in the list-file and then to run python import_list.py, right?
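
A minimal sketch of preparing such a list, assuming the list file holds one Wikidata ID per line. The file name `properties.txt`, the helper `write_pid_list`, and the one-ID-per-line format are assumptions on my side, so check what import_list.py actually reads before relying on it.

```python
import re
import tempfile
from pathlib import Path

# A property ID is "P" followed by a positive integer.
PID_PATTERN = re.compile(r"P[1-9]\d*")


def write_pid_list(pids, path):
    """Validate, de-duplicate, and sort property IDs, then write one per line."""
    valid = {pid.strip() for pid in pids if PID_PATTERN.fullmatch(pid.strip())}
    ordered = sorted(valid, key=lambda pid: int(pid[1:]))
    Path(path).write_text("".join(f"{pid}\n" for pid in ordered), encoding="utf-8")
    return ordered


# Demo in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    list_file = Path(tmp) / "properties.txt"  # assumed file name
    ordered = write_pid_list(["P31", "P281", "P2302", "P31", "Q5"], list_file)
    written = list_file.read_text(encoding="utf-8")

print(ordered)
print(written)
```

Item IDs such as Q5 and duplicates are dropped here on purpose, since the goal is a list of properties only.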

P.S. Sure, let's agree on time via LI.

D063520 commented 2 years ago

can you run the script a second time?

shigapov commented 2 years ago

On a clean instance after the second run of import_one.py I have:

python3 import_one.py P31
WARNING: /.local/lib/python3.10/site-packages/rdflib_jsonld/__init__.py:9: DeprecationWarning: The rdflib-jsonld package has been integrated into rdflib as of rdflib==6.0.1.  Please remove rdflib-jsonld from your project's dependencies.
  warnings.warn(

WARNING: API error modification-failed: Property [[Property:P1|P1]] already has label "Wikidata PID" associated with language code en.
Importing P31
Change Property P31
Import Property P31 from Wikidata
WARNING: API error modification-failed: Property [[Property:P2|P2]] already has label "instance of" associated with language code en.
Could not set description of  -1
Edit to page [[my:Property:-1]] failed:
modification-failed: Property [[Property:P2|P2]] already has label "instance of" associated with language code en.
[messages: [{'name': 'wikibase-validator-label-conflict', 'parameters': ['instance of', 'en', '[[Property:P2|P2]]'], 'html': {'*': 'Property <a href="/wiki/Property:P2" title="Property:P2">P2</a> already has label "instance of" associated with language code en.'}}];
 help: See http://localhost/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce&gt; for notice of API deprecations and breaking changes.]
This should not happen 6
Error probably property or item already existing  Edit to page [[my:Property:-1]] failed:
modification-failed: Property [[Property:P2|P2]] already has label "instance of" associated with language code en.
[messages: [{'name': 'wikibase-validator-label-conflict', 'parameters': ['instance of', 'en', '[[Property:P2|P2]]'], 'html': {'*': 'Property <a href="/wiki/Property:P2" title="Property:P2">P2</a> already has label "instance of" associated with language code en.'}}];
 help: See http://localhost/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce&gt; for notice of API deprecations and breaking changes.]
Traceback (most recent call last):
  File "/WikibaseSync/import_one.py", line 31, in <module>
    wikibase_importer.change_property(wikidata_property, wikibase_repo, True)
  File "/WikibaseSync/util/util.py", line 1110, in change_property
    self.changeClaims(wikidata_item, wikibase_item)
  File "/WikibaseSync/util/util.py", line 865, in changeClaims
    for wikibase_claims in wikibase_item.claims:
AttributeError: 'NoneType' object has no attribute 'claims'
CRITICAL: Exiting due to uncaught exception <class 'AttributeError'>

D063520 commented 2 years ago

I'm trying to reproduce it ... did you use https://www.mediawiki.org/wiki/Wikibase/Docker?

shigapov commented 2 years ago

Yes, the latest release https://github.com/wmde/wikibase-release-pipeline/tree/wmde.2 within https://github.com/UB-Mannheim/RaiseWikibase.

D063520 commented 2 years ago

As you saw I run into this:

https://github.com/wmde/wikibase-release-pipeline/issues/265

I will wait a bit to see if someone has an idea on how to fix it ... I spent my morning trying to make it work. I never had a problem with the old version ....

shigapov commented 2 years ago

Yes, a weird problem.

phucty commented 2 years ago

Regarding the warning "Property [[Property:P2|P2]] already has label "instance of" associated with language code en.": it would be better to search for the property by its label first instead of running into an error while adding a new property. https://github.com/the-qa-company/WikibaseSync/blob/6d04d6a0990443aee163de8c5775808a592d40a6/util/PropertyWikidataIdentifier.py#L23

--> The current approach creates a new property ID and then removes it. In the end, the property ID numbers become very large, like this one
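
A rough sketch of the "search first" idea, using the `wbsearchentities` module of the Wikibase API. The helper names `build_search_params` and `find_property_by_label` are hypothetical, and the sample response is hand-written and trimmed rather than captured from a server. No request is sent here; the parameters would go to the wiki's api.php.

```python
def build_search_params(label, language="en"):
    """Query parameters for the Wikibase `wbsearchentities` API module."""
    return {
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "type": "property",
        "format": "json",
    }


def find_property_by_label(search_response, label):
    """Return the ID of the first hit whose label matches exactly, else None."""
    for hit in search_response.get("search", []):
        if hit.get("label") == label:
            return hit["id"]
    return None


# Hand-written, trimmed response for illustration only.
sample_response = {
    "search": [
        {"id": "P7", "label": "instance of (old)"},
        {"id": "P2", "label": "instance of"},
    ]
}

existing = find_property_by_label(sample_response, "instance of")
print(existing or "no match, safe to create")
```

If a match is found, the importer could reuse that ID instead of attempting to create the property and burning a new ID on the failed edit.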

Regarding:

Traceback (most recent call last):
  File "/WikibaseSync/import_one.py", line 31, in <module>
    wikibase_importer.change_property(wikidata_property, wikibase_repo, True)
  File "/WikibaseSync/util/util.py", line 1110, in change_property
    self.changeClaims(wikidata_item, wikibase_item)
  File "/WikibaseSync/util/util.py", line 865, in changeClaims
    for wikibase_claims in wikibase_item.claims:
AttributeError: 'NoneType' object has no attribute 'claims'
CRITICAL: Exiting due to uncaught exception <class 'AttributeError'>

It seems that something went wrong when wikibase_item was imported in this function: https://github.com/the-qa-company/WikibaseSync/blob/6d04d6a0990443aee163de8c5775808a592d40a6/util/util.py#L1057 You may have to debug it to see what happened.

I also ran into this problem and found that the error was in my config/application.config.ini. If we use the default settings of Wikibase Docker, the prefix of entityUri and propertyUri might be wikibase.svc.
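
A small sketch for checking this before running an import. The section name and the values in `SAMPLE` are placeholders, and the helpers `find_uri_settings` and `mismatched` are hypothetical; only the key names entityUri and propertyUri come from the config mentioned above.

```python
import configparser


def find_uri_settings(config_text, keys=("entityUri", "propertyUri")):
    """Collect the given keys from every section of an INI document."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case; configparser lowercases by default
    parser.read_string(config_text)
    found = {}
    for section in parser.sections():
        for key in keys:
            if parser.has_option(section, key):
                found[key] = parser.get(section, key)
    return found


def mismatched(settings, expected_prefix):
    """Return the keys whose value does not start with the expected prefix."""
    return sorted(k for k, v in settings.items() if not v.startswith(expected_prefix))


# Placeholder config fragment; section name and values are invented.
SAMPLE = """
[wikibase]
entityUri = http://wikibase.svc/entity
propertyUri = http://localhost/prop/direct
"""

settings = find_uri_settings(SAMPLE)
print(mismatched(settings, "http://wikibase.svc"))
```

To check a real file, pass `Path("config/application.config.ini").read_text()` instead of `SAMPLE`, with the prefix your own instance uses.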

D063520 commented 2 years ago

@shigapov: could you check? I pushed a fix. The config files are also better aligned with the latest release now.

shigapov commented 2 years ago

I ran python3 import_one.py P31 and it created the property with everything in it, very nice, thank you!

But then I wanted to create another property, and it failed:

python3 import_one.py P281
WARNING: /.local/lib/python3.10/site-packages/rdflib_jsonld/__init__.py:9: DeprecationWarning: The rdflib-jsonld package has been integrated into rdflib as of rdflib==6.0.1.  Please remove rdflib-jsonld from your project's dependencies.
  warnings.warn(

WARNING: API error modification-failed: Property [[Property:P1|P1]] already has label "Wikidata PID" associated with language code en.
Importing P281
Change Property P281
Import Property P281 from Wikidata
claimsToRemove  []
P1855
False
Import Property P1855 from Wikidata
WARNING: API error modification-failed: Property [[Property:P4|P4]] already has label "Wikidata property example" associated with language code en.
Could not set description of  -1
Edit to page [[my:Property:-1]] failed:
modification-failed: Property [[Property:P4|P4]] already has label "Wikidata property example" associated with language code en.
[messages: [{'name': 'wikibase-validator-label-conflict', 'parameters': ['Wikidata property example', 'en', '[[Property:P4|P4]]'], 'html': {'*': 'Property <a href="/wiki/Property:P4" title="Property:P4">P4</a> already has label "Wikidata property example" associated with language code en.'}}];
 help: See http://localhost/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce&gt; for notice of API deprecations and breaking changes.]
This should not happen 6
Error probably property or item already existing  Edit to page [[my:Property:-1]] failed:
modification-failed: Property [[Property:P4|P4]] already has label "Wikidata property example" associated with language code en.
[messages: [{'name': 'wikibase-validator-label-conflict', 'parameters': ['Wikidata property example', 'en', '[[Property:P4|P4]]'], 'html': {'*': 'Property <a href="/wiki/Property:P4" title="Property:P4">P4</a> already has label "Wikidata property example" associated with language code en.'}}];
 help: See http://localhost/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce&gt; for notice of API deprecations and breaking changes.]
Import Entity Q70 from Wikidata
Traceback (most recent call last):
  File "/WikibaseSync/import_one.py", line 31, in <module>
    wikibase_importer.change_property(wikidata_property, wikibase_repo, True)
  File "/WikibaseSync/util/util.py", line 1112, in change_property
    self.changeClaims(wikidata_item, wikibase_item)
  File "/WikibaseSync/util/util.py", line 992, in changeClaims
    claim = self.translateClaim(wikidata_claim.get('mainsnak'))
  File "/WikibaseSync/util/util.py", line 551, in translateClaim
    claim = pywikibot.Claim(self.wikibase_repo, self.id.get_id(wikidata_propertyId), datatype='wikibase-item')
  File "/WikibaseSync/util/IdSparql.py", line 52, in get_id
    return self.mapProperty[id]
KeyError: 'P1855'
CRITICAL: Exiting due to uncaught exception <class 'KeyError'>

D063520 commented 2 years ago

@shigapov new fix .... this should not have happened ... the problem is that the version that we have in production is not fully aligned to this one. I will try to move this one to production so that we can contribute better to the repo.

Did you execute the imports above in a test instance?
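
For readers following the `KeyError: 'P1855'` traceback above: the lookup fails because a property referenced by a statement has no entry in the ID mapping yet. Below is a sketch of one defensive pattern, importing on a miss. It is not the fix that was pushed; `PropertyIdMap` and the `import_property` callback are hypothetical stand-ins and do not mirror the real `IdSparql` class.

```python
class PropertyIdMap:
    """Maps Wikidata property IDs to local IDs, importing on a cache miss."""

    def __init__(self, import_property):
        self._local_ids = {}
        # Callable taking a Wikidata ID and returning a local ID or None.
        self._import_property = import_property

    def get_id(self, wikidata_id):
        if wikidata_id not in self._local_ids:
            local_id = self._import_property(wikidata_id)
            if local_id is None:
                raise LookupError(
                    f"{wikidata_id} could not be imported or found locally"
                )
            self._local_ids[wikidata_id] = local_id
        return self._local_ids[wikidata_id]


calls = []


def fake_import(wikidata_id):
    """Stand-in for a real import; records each call for the demo."""
    calls.append(wikidata_id)
    return {"P1855": "P4"}.get(wikidata_id)


id_map = PropertyIdMap(fake_import)
print(id_map.get_id("P1855"))  # triggers the import callback
print(id_map.get_id("P1855"))  # served from the mapping, no second import
print(calls)
```

Raising a descriptive `LookupError` instead of a bare `KeyError` at least names the property that is missing when the import itself fails.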

D063520 commented 2 years ago

@shigapov did you find time to test it?

shigapov commented 2 years ago

Hi @D063520! I've just repeated the experiment on a fresh Wikibase instance 1.36 with your updates from 20.12.2021. I got the same error message as explained in https://github.com/the-qa-company/WikibaseSync/issues/4#issuecomment-996523763. So I ran python3 import_one.py P31 and then python3 import_one.py P281. The first command worked perfectly, and the second one failed.

mezez commented 2 years ago

I tried to reproduce this on a fresh install:

First with python import_one.py P31

Second with python import_one.py P281

Both imports completed successfully.

D063520 commented 2 years ago

@shigapov: sorry, could you try again and check whether it is exactly the same error? Could it be that you ran into this issue: https://github.com/the-qa-company/WikibaseSync/issues/6?

D063520 commented 2 years ago

It could also be related to this: https://github.com/the-qa-company/WikibaseSync/issues/15