Closed Ironholds closed 7 years ago
@Ironholds yes, for me it is ok!
Cool! Okay, I will tinker around with this this evening :). Obviously, feel free to do so yourself as well!
I just added some new features and changed the license to MIT
Yay! I've started integrating the geo code :)
I just looked at some of your new geo script. In the query_property functions I query asking just for English, while in the query_location functions I am using 31 different languages (that's just for the purpose of the pilot I am running here at Eurostat). I was thinking about simplifying everything by using only English or, better, making the language another parameter of the query. What do you think?
I think using it as a parameter makes total sense; I'll build it into this version!
Ok I'll also modify the package, it will be fast!
Done! Now you can specify a language when you query. If you don't, the default language will automatically be English.
Great! Implementation looks the same at my end. About to push fully documented-and-implemented geo-related queries.
(Got an example of a working geo-corner-based set of corners and cities?)
Yes, try with Bruges:
Perfect!
@serenasignorelli I don't quite get the property queries (SPARQL still gives me a headache). What are they doing?
They simply perform the location query in the three ways, but in this case you also ask for the Wikidata statement 'Instance of' (P31). I kept this query separate from the location query because here you get fewer items (not all geo-located items have the property P31). Moreover, with this query you get not only the property P31 but also its identifier (this will be used when you query for the class).
There is an error in the first query of the get_geo_entity function. The query should be:
SELECT DISTINCT ?item ?name ?coord WHERE { ?item wdt:P131* wd:", entity, ". ?item wdt:P625 ?coord . SERVICE wikibase:label { bd:serviceParam wikibase:language \"", language, "\" . ?item rdfs:label ?name } } ORDER BY ASC (?name)")
With this query you no longer ask for the property (P31) as well, so you get the complete set of items (in fact, the second query of the function and the box query correctly don't ask for P31). You also don't have to call the label service twice.
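A minimal sketch of how the corrected query string might be assembled in R. The helper name build_geo_query is hypothetical (it is not part of WikidataR), and the English default for language is assumed from the earlier discussion in this thread. The function only builds the string; it does not contact the query service.

```r
# Hypothetical helper: assembles the corrected SPARQL query as a string.
# `entity` is a Wikidata ID such as "Q1492"; `language` defaults to English.
build_geo_query <- function(entity, language = "en") {
  paste0(
    "SELECT DISTINCT ?item ?name ?coord WHERE { ",
    "?item wdt:P131* wd:", entity, ". ",
    "?item wdt:P625 ?coord . ",
    "SERVICE wikibase:label { ",
    "bd:serviceParam wikibase:language \"", language, "\" . ",
    "?item rdfs:label ?name } } ",
    "ORDER BY ASC (?name)"
  )
}

query_en <- build_geo_query("Q1492")                   # default language
query_fr <- build_geo_query("Q1492", language = "fr")  # explicit language
cat(query_en, "\n")
```

Note that the string contains no reference to P31 and only one rdfs:label pattern, matching the two points raised above.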
Good catch! Fixed. Not sure how that got in there (world's weirdest typo?)
Is property-integration necessary, then? Or is the same information retrieved by calling get_item on each entity returned?
I am trying to test this, but if I run get_item("Q1492") I get this error: Error in x$type : $ operator is invalid for atomic vectors. Am I doing something wrong, or is there a bug? If the function works as I expect, the property integration will not be necessary.
Looks like a bug caused by the recent refactoring! Good find! tinkers
Should be fixed in the latest commit. Apparently muggins here decided to begin integrating multi-item querying, and then not finish it, but ship it anyway. Well done Oliver.
Ok tested and it works. The problem is that if I look for a location, I get the name of the item, aliases and the description. Then it only tells me that this item has a certain number of claims and of sitelinks. When I query for property, I get as output what's in the 'Instance of' claim. So how could we fix this? We could ask for a claim (in this case P31) or leave it to the user to choose which claim(s) to ask for. I don't know if this could be done as a modified version of get_item or if it needs a different one (or the property query)
nope; that's the print method - it's a summary, to avoid overwhelming your eyeballs with information. Run str(results_of_get_item) to see the full thing!
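To illustrate the difference between a summarising print method and str(), here is a small sketch using a toy S3 class. The class name toy_item and its fields are made up for this example; the objects returned by get_item are structured differently and are more deeply nested.

```r
# Toy object standing in for a retrieved item; the fields are illustrative.
item <- structure(
  list(
    label  = "Example city",
    claims = list(P31 = "Q515", P17 = "Q29")
  ),
  class = "toy_item"
)

# A print method that shows only a short summary, not the full contents.
print.toy_item <- function(x, ...) {
  cat("Item:", x$label, "\n")
  cat("Claims:", length(x$claims), "\n")
  invisible(x)
}

print(item)  # summary only: the label and the number of claims
str(item)    # full nested structure, including the claim values
```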
oooh sorry! yes in fact everything is there! so after the query we need to apply this get_item to the items that we got on the query and filter only on the property that we want, right?
Exactly! And that last bit we can probably leave to the individual programmers. Unless we want to write some helper functions. Like, we could have a generic extract_property function where you give it a wikidata item (or, set of items) and a property number, and it returns a data.frame of the property values in the dataset. I'll add that as a future-thing-to-do
Perfect! I would also consider letting the generic function accept several property numbers rather than only one.
Yeah, makes sense!
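A rough sketch of what such an extract_property helper could look like, accepting one or several property numbers. It assumes a simplified item structure (a list with an id and a named claims list of character vectors), which is not what get_item actually returns; the real objects are more deeply nested, so a real implementation would need extra unpacking.

```r
# Hypothetical helper: pull the values of one or more properties out of a
# list of items and return them as a long-format data.frame.
# Assumes each item is a list with `id` and a named `claims` list.
extract_property <- function(items, properties) {
  rows <- list()
  for (item in items) {
    for (prop in properties) {
      values <- item$claims[[prop]]
      if (is.null(values)) next  # item has no claim for this property
      rows[[length(rows) + 1]] <- data.frame(
        item = item$id,
        property = prop,
        value = values,
        stringsAsFactors = FALSE
      )
    }
  }
  if (length(rows) == 0) {
    # Nothing matched: return an empty data.frame with the same columns.
    return(data.frame(item = character(), property = character(),
                      value = character(), stringsAsFactors = FALSE))
  }
  do.call(rbind, rows)
}

# Mock items; the IDs and values are placeholders, not real query results.
items <- list(
  list(id = "Q_A", claims = list(P31 = c("Q_x", "Q_y"), P17 = "Q_z")),
  list(id = "Q_B", claims = list(P17 = "Q_z"))
)

result <- extract_property(items, c("P31", "P17"))
print(result)
```

Returning one row per item, property and value keeps the output rectangular even when an item has several values for a property or none at all.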
I was playing with the get_geo functions. In get_geo_entity, if I put only the city code, the function understands that the radius is zero. If I put the city code, the language (in the format 'fr') and the radius (just the number), it understands that I want to use the wikibase around service. But if I only put the city code and a number (the radius), the function understands the radius as zero and gives me back the same result as in the first case! It works only if I put radius =
sorry, radius = number
I'm not quite getting it. That is, it misunderstands if you don't use named arguments and leave out language?
Yes. If you run:
get_geo_entity(1492)
and:
get_geo_entity(1492, '4')
or:
get_geo_entity(1492, 4)
you get the same output, and this is a bug. Whereas if you run:
get_geo_entity(1492, radius = 4)
the function works as expected.
Not a bug in the code; a component of the language. R allows for either implicit parameters:
foo("bar", "baz")
Or explicit:
foo(a = "bar", b = "baz")
When you use implicit, it just takes the first value to be for the first argument, the second for the second, and so on; it doesn't consider the presence of defaults. You can replicate with:
test_args <- function(a, b = NULL, c) {
  # b has a default, but positional matching still fills it second
  if (!is.null(b)) {
    return("you provided a value for b!")
  }
  return("You didn't provide a value for b")
}

# "bar" is matched to b, not c
test_args("foo", "bar")
ok, got it! So always better to use explicit (and tell people that they have to), or build functions with only one default parameter, placed as last parameter in the function (if I got the concept)
Exactly; I try to use explicit in the examples just to set the pattern. Or (with the example above) people can happily do:
test_args("foo", c = "bar")
and that works fine.
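One way to sidestep the positional-matching surprise is sketched below with a hypothetical function, demo_geo (this is not the actual get_geo_entity). Placing ... before the optional arguments means they can only be supplied by name, so a stray positional value is never silently matched to language. The argument names mirror those mentioned in this thread; the body just reports what it received.

```r
# Hypothetical function: arguments after `...` can only be matched by name.
demo_geo <- function(entity, ..., language = "en", radius = NULL) {
  # Reject stray positional values instead of silently ignoring them.
  if (length(list(...)) > 0) {
    stop("language and radius must be passed by name, e.g. radius = 4")
  }
  list(entity = entity, language = language, radius = radius)
}

# Named use works as intended.
named <- demo_geo(1492, radius = 4)
str(named)

# A positional second value now fails loudly rather than becoming `language`.
positional <- tryCatch(
  demo_geo(1492, 4),
  error = function(e) conditionMessage(e)
)
print(positional)
```

The trade-off is that callers lose the shorthand of positional calls, in exchange for never getting a result from arguments that were matched differently than they expected.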
Indeed! Thank you for the quick R lecture!
I think we're done with integrating things?
Yes, we are!
@serenasignorelli wrote an amazing package at https://github.com/serenasignorelli/QueryWikidataR - we should build it into WikidataR!
Current blockers: