Open wetneb opened 4 years ago
Yes, that would be great. And we should be able to link the column holding the fetched data in the WD schema, so the data can be pushed back.
I am not sure what you mean by "link the column". Do you mean using column groups? I don't see how column groups can be relied on in the WD schema.
What I meant is that if I could query qualifiers and references, then they could also be pushed back. That makes a round trip (get the data, fill in the blanks, push the data back).
Right now this can't be done, since qualifiers and references can't be imported in the first place.
Would this be why I'm having this issue? Sorry if the terminology is off -- perhaps I should have said "qualifier" instead of "flag" in the subject line…
No, your issue is not linked to qualifiers, but it's also an interesting one. I replied there :)
Use case mentioned here by @mshd:
I would like to reconcile Wikidata with a certain qualifier. Is that possible? If not, could you implement it?
Example:
Set qualifier property to North Sumatera III, or give me all people who ever had a candidacy in this district.
I would love it. In my use case I have annual data like "total revenue", and without fetching qualifiers it's really difficult to update only those items with no data for a certain year.
Let me expand on the design questions that need to be resolved before this can be implemented. This issue can be understood in multiple ways:
Possible syntaxes we could add to support these use cases (where P3602 is candidacy in election, P1111 is votes received and P768 is electoral district):
P3602#P1111 (all P1111 qualifiers on all P3602 statements)
P3602=Q108816797#P1111 (all P1111 qualifiers on P3602=Q108816797 statements)
P3602[P768=Q96984689] (all main statement values on P3602 statements with P768=Q96984689 qualifier)
Do you see other use cases not covered by these points? Which of those use cases would be useful to you?
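To make the proposal concrete, here is a minimal sketch of how the three syntaxes could be parsed. Nothing here exists in OpenRefine or the reconciliation service: `QualifierQuery`, `parse_query` and the field names are hypothetical, and the grammar is only one possible reading of the examples (it also accepts combinations that were not proposed, such as a `[...]` filter together with a `#` suffix).

```python
import re
from typing import NamedTuple, Optional


class QualifierQuery(NamedTuple):
    """Hypothetical parsed form of a property request."""
    prop: str                               # main property, e.g. "P3602"
    value: Optional[str] = None             # main value filter, e.g. "Q108816797"
    fetch_qualifier: Optional[str] = None   # qualifier to return (after "#")
    filter_qualifier: Optional[str] = None  # qualifier property inside "[...]"
    filter_value: Optional[str] = None      # qualifier value inside "[...]"


# Assumed grammar: Pxxx [=Qxxx] [ "[" Pxxx=Qxxx "]" ] [ "#" Pxxx ]
_PATTERN = re.compile(
    r"^(?P<prop>P\d+)"
    r"(?:=(?P<value>Q\d+))?"
    r"(?:\[(?P<fprop>P\d+)=(?P<fvalue>Q\d+)\])?"
    r"(?:#(?P<qual>P\d+))?$"
)


def parse_query(text: str) -> QualifierQuery:
    """Parse one of the proposed syntaxes, or raise ValueError."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised property syntax: {text!r}")
    return QualifierQuery(
        prop=match["prop"],
        value=match["value"],
        fetch_qualifier=match["qual"],
        filter_qualifier=match["fprop"],
        filter_value=match["fvalue"],
    )
```

For example, `parse_query("P3602=Q108816797#P1111")` yields a query on P3602 restricted to the value Q108816797 that asks for the P1111 qualifier.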
Looks good to me.
Case 3 could get more complicated only if the qualifiers are not items themselves, e.g. point in time, which could be just a year but is sometimes a specific date. In Wikidata I would use FILTER for the qualifier. As a workaround we could use case 1 and do the filtering in OpenRefine later.
As a workaround we could use case 1 and do the filtering in OpenRefine later.
The problem with 1. is that it would only fetch the qualifier values, not the main statement values, so it is not clear to me how you can use it to reimplement 2 or 3 by adding local filtering afterwards.
P3602[P768=Q96984689] (all main statement values on P3602 statements with P768=Q96984689 qualifier)
I do not see a clean way to implement this given the existing API.
Do you see other use cases not covered by these points? Which of those use cases would be useful to you?
@wetneb: fine for 1. and 2. But why not P3602#P768=Q96984689 for 3.? And for 4.: why not Pxxx#*?
Regards, Antoine
As a workaround we could use case 1 and do the filtering in OpenRefine later.
The problem with 1. is that it would only fetch the qualifier values, not the main statement values, so it is not clear to me how you can use it to reimplement 2 or 3 by adding local filtering afterwards.
I thought it would only work in multiple steps. In my case (total revenue and point in time) I would try:
But you are right. It would only work if I could use the values of a column as qualifiers in my query.
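To illustrate the "total revenue by year" case, here is a rough sketch of the local filtering step, assuming the statements were somehow available in the standard Wikibase JSON shape (which the data extension API does not provide today). The helper names are hypothetical, P585 is assumed to be the "point in time" qualifier, and the sample data is made up.

```python
import re


def qualifier_year(snak):
    """Return the year of a time-valued snak, or None if it has no value."""
    if snak.get("snaktype") != "value":
        return None
    # Wikibase times look like "+2019-00-00T00:00:00Z" (year precision)
    # or "+2020-12-31T00:00:00Z" (day precision); only the year is kept.
    match = re.match(r"([+-]\d+)-", snak["datavalue"]["value"]["time"])
    return int(match.group(1)) if match else None


def years_with_data(statements, qualifier="P585"):
    """Collect the years already covered by a qualifier on the statements."""
    years = set()
    for statement in statements:
        for snak in statement.get("qualifiers", {}).get(qualifier, []):
            year = qualifier_year(snak)
            if year is not None:
                years.add(year)
    return years


def time_snak(time):
    """Build a minimal time snak for the made-up sample below."""
    return {"snaktype": "value", "property": "P585",
            "datavalue": {"type": "time", "value": {"time": time}}}


# Made-up sample: two revenue statements, one per year.
sample = [
    {"qualifiers": {"P585": [time_snak("+2019-00-00T00:00:00Z")]}},
    {"qualifiers": {"P585": [time_snak("+2020-12-31T00:00:00Z")]}},
]

# Years we want to have data for, minus the years already present.
missing = {2019, 2020, 2021} - years_with_data(sample)
```

Comparing on the year alone sidesteps the year-versus-full-date precision problem mentioned earlier, at the cost of ignoring months and days.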
@antoine2711 for 4., the problem is not to find a syntax for it, but rather to see how it would fit in the protocol. At the moment, when the user requests a property, we can only return one column for it.
I guess one hacky workaround would be to let the user fetch the full JSON of the statements, and we would let them manipulate that themselves in OpenRefine. After all, there are a ton more fields we are not exposing (ranks, references…) and it is unlikely we can find a satisfactory syntax to fetch all those fields, so it would be good to have this fallback option for power users.
It would still be more convenient than having to query the Wikibase API directly.
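As a rough illustration of what power users could do with such a fallback, the sketch below filters statement JSON locally to emulate P3602[P768=Q96984689]. It assumes the statements arrive in the standard Wikibase JSON shape (`mainsnak`, `qualifiers`, `rank`); the function names are hypothetical, and the second statement's district id is a made-up placeholder.

```python
def snak_id(snak):
    """Return the item id of an entity-valued snak, or None otherwise."""
    if snak.get("snaktype") != "value":
        return None
    value = snak["datavalue"]["value"]
    return value.get("id") if isinstance(value, dict) else None


def main_values_with_qualifier(statements, qualifier_prop, qualifier_value):
    """Return (main value, rank) for statements carrying the given qualifier."""
    matches = []
    for statement in statements:
        snaks = statement.get("qualifiers", {}).get(qualifier_prop, [])
        if any(snak_id(snak) == qualifier_value for snak in snaks):
            matches.append((snak_id(statement["mainsnak"]), statement.get("rank")))
    return matches


def item_snak(prop, qid):
    """Build a minimal item-valued snak for the sample data below."""
    return {"snaktype": "value", "property": prop,
            "datavalue": {"type": "wikibase-entityid", "value": {"id": qid}}}


# Sample P3602 statements; "Q999999999" is a placeholder, not a real district.
sample = [
    {"mainsnak": item_snak("P3602", "Q108816797"), "rank": "normal",
     "qualifiers": {"P768": [item_snak("P768", "Q96984689")]}},
    {"mainsnak": item_snak("P3602", "Q108816797"), "rank": "normal",
     "qualifiers": {"P768": [item_snak("P768", "Q999999999")]}},
]

matches = main_values_with_qualifier(sample, "P768", "Q96984689")
```

Because ranks and references sit in the same JSON, the same approach would extend to those fields without any new syntax.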
The problem with 1. is that it would only fetch the qualifier values, not the main statement values
Oh! I see, @wetneb. So the problem is bringing the structure into OR? Why couldn't 2 columns be brought in at the same time? I understand it requires creating rows at 2 levels, the outer statements and the inner qualifiers. But still, is that so complicated?
Also, OR has a (not very functional) grouping of columns, like what you get from importing XML or JSON. Could that mechanism be reused?
I write that because, for me, in all 4 scenarios, I would like the statement value AND the qualifier's property AND the value of the qualifier's property.
Regards, Antoine
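One way to picture the two-level layout described above is the sketch below, which flattens statements into record-style rows of (statement value, qualifier property, qualifier value). This is only an illustration of the shape of the data, not a proposal for how the protocol or OpenRefine would do it: the function names are hypothetical, the input is assumed to be standard Wikibase statement JSON, and the vote count is made up.

```python
def snak_value(snak):
    """Reduce a snak to a simple cell value (item id, amount, time or text)."""
    if snak.get("snaktype") != "value":
        return None
    value = snak["datavalue"]["value"]
    if isinstance(value, dict):
        for key in ("id", "amount", "time", "text"):
            if key in value:
                return value[key]
    return value


def to_record_rows(statements):
    """Emit one row per qualifier; the statement value appears on the first row only."""
    rows = []
    for statement in statements:
        main = snak_value(statement["mainsnak"])
        pairs = [(prop, snak_value(snak))
                 for prop, snaks in statement.get("qualifiers", {}).items()
                 for snak in snaks]
        if not pairs:
            rows.append((main, None, None))
            continue
        for index, (prop, value) in enumerate(pairs):
            # Blank the outer cell on continuation rows, as in a record.
            rows.append((main if index == 0 else None, prop, value))
    return rows


# Sample statement; the vote count "+1000" is made up.
sample = [{
    "mainsnak": {"snaktype": "value", "property": "P3602",
                 "datavalue": {"value": {"id": "Q108816797"}}},
    "qualifiers": {
        "P1111": [{"snaktype": "value", "property": "P1111",
                   "datavalue": {"value": {"amount": "+1000", "unit": "1"}}}],
        "P768": [{"snaktype": "value", "property": "P768",
                  "datavalue": {"value": {"id": "Q96984689"}}}],
    },
}]

rows = to_record_rows(sample)
```

The flattening itself is simple; the open question raised in this thread is how three columns per requested property would fit a protocol that currently returns one.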
All I can say is that I do not know how that should be implemented. Again, proposals and pull requests are welcome.
I guess one hacky workaround would be to let the user fetch the full JSON of the statements, and we would let them manipulate that themselves in OpenRefine.
That would be great in many ways, because we could expand the syntax to add @ and the source property, with the same logic.
As for accessing that data, since all those queries start from a recon column, maybe add fields to the recon...
Or, in the new column, save the data as a new recondata object. It would save either recon or values, and the cell of the initial recon column (the element of the statement).
By the same logic, we might want columns of reconciled properties that could replace properties in the Wikidata schema.
So the recondata could have a type of statement value, statement property, qualifier property, qualifier value, source property, or source value.
Expanding this logic seems quite in line with the Wikibase generalization (though that is another topic).
Sorry @wetneb and the others if I am OT with too much OpenRefine; it's just that here the two are so linked to and dependent on each other, in my view.
Regards, Antoine
I have just received a request via email from another user who would find this very helpful.
It would be very useful for data extension for Wikimedia Commons' structured data, as P170 is usually described with several qualifiers there.
I have just received a request via email from another user who would find this very helpful.
It would be very useful for data extension for Wikimedia Commons' structured data, as P170 is usually described with several qualifiers there.
That user is me :-) . I like @wetneb 's solution to enable loading full statement JSONs. This would solve many possible feature requests in one go :)
There is currently no way to fetch qualifiers in the data extension API (or to refine during reconciliation). A syntax for such qualifiers should be picked and implemented.