lvaudor / glitter

an R package which writes SPARQL queries
https://lvaudor.github.io/glitter

Keep same variable name when using spq_mutate, spq_summarise #97

Closed lvaudor closed 1 year ago

lvaudor commented 1 year ago

For now it is not possible to reuse an existing variable name as the output of spq_mutate or spq_summarise, e.g.

library("glitter")
# Overwriting `date` with its own year is what currently fails
tib <- spq_init() %>%
  spq_add("?film wdt:P31 wd:Q11424", .label = "film") %>%
  spq_add("?film wdt:P577 ?date") %>%
  spq_mutate(date = year(date)) %>%
  spq_head(10) %>%
  spq_perform()

This generates an error, whereas it works fine with date2 = year(date).

maelle commented 1 year ago

Note: if we make this work, it needs to work for more than two levels of reuse, say

SELECT ?film ?date0 (YEAR(?date0) AS ?date) (STRLEN(?date) AS ?bla)
WHERE{

?film wdt:P31 wd:Q11424.
?film wdt:P577 ?date0.

SERVICE wikibase:label { bd:serviceParam wikibase:language "en".}
}

LIMIT 10

However I'm not sure it should work?

maelle commented 1 year ago

SELECT ?film ?date (YEAR(?date) AS ?year)
WHERE{

?film wdt:P31 wd:Q11424.
?film wdt:P577 ?date.

SERVICE wikibase:label { bd:serviceParam wikibase:language "en".}
}

LIMIT 10

obviously works

but

SELECT ?film (YEAR(?date) AS ?date)
WHERE{

?film wdt:P31 wd:Q11424.
?film wdt:P577 ?date.

SERVICE wikibase:label { bd:serviceParam wikibase:language "en".}
}

LIMIT 10

doesn't: the result is weird, with dates returned as things such as http://www.wikidata.org/.well-known/genid/93aa88ef43b35a7e29b7f52c8ff3bed8

maelle commented 1 year ago

So what should happen?

lvaudor commented 1 year ago

I think glitter should automatically make some replacements!... Yeah I know...
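A minimal sketch of what such an automatic replacement could look like, in base R only. This is not glitter's implementation: the `reuse_name()` helper and the `list(triples, select)` query representation are assumptions made for illustration. The idea is to rename the existing variable to the first free numbered name (`?date0`, `?date1`, ...) everywhere it already appears, then add the new expression under the original name.

```r
# Hypothetical helper, not part of glitter.
# `query` is assumed to be a list with character vectors `triples` and `select`.
reuse_name <- function(query, name, sparql_fun) {
  var <- paste0("?", name)
  # Match the whole variable only: "?date" but not "?date0"
  pattern <- paste0("\\?", name, "\\b")
  everything <- c(query$triples, query$select)

  # Find the first numbered name that is not used anywhere yet
  n <- 0
  while (any(grepl(paste0("\\?", name, n, "\\b"), everything, perl = TRUE))) {
    n <- n + 1
  }
  old <- paste0(var, n)

  # Rename the existing variable, then define the new one under the old name
  query$triples <- gsub(pattern, old, query$triples, perl = TRUE)
  query$select <- gsub(pattern, old, query$select, perl = TRUE)
  query$select <- c(query$select, sprintf("(%s(%s) AS %s)", sparql_fun, old, var))
  query
}

q <- list(
  triples = c("?film wdt:P31 wd:Q11424.", "?film wdt:P577 ?date."),
  select = c("?film", "?date")
)

q1 <- reuse_name(q, "date", "YEAR")
q1$select
#> [1] "?film"                   "?date0"                  "(YEAR(?date0) AS ?date)"

# A second level of reuse renames the intermediate result as well
q2 <- reuse_name(q1, "date", "STR")
q2$select[3:4]
#> [1] "(YEAR(?date0) AS ?date1)" "(STR(?date1) AS ?date)"
```

Renaming inside earlier SELECT expressions as well as inside triples is what lets the sketch handle more than two levels of reuse.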

lvaudor commented 1 year ago

I guess for this issue and others we'll have to re-think the way we keep track of the variable names, i.e. have an element of the query that contains all variables mentioned or created in the query, and one containing SELECTed ones...
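As a rough illustration of that idea (an assumption about one possible shape, not how glitter stores queries), the query object could carry two character vectors: every variable mentioned or created so far, and the subset that is SELECTed. The `new_query()`, `track_vars()` and `is_taken()` names below are hypothetical.

```r
# Hypothetical structure, not glitter's actual internals.
new_query <- function() {
  list(vars = character(), selected = character())
}

# Record every "?variable" found in a query fragment
track_vars <- function(query, text, select = TRUE) {
  found <- unique(
    regmatches(text, gregexpr("\\?[A-Za-z_][A-Za-z0-9_]*", text))[[1]]
  )
  query$vars <- union(query$vars, found)
  if (select) {
    query$selected <- union(query$selected, found)
  }
  query
}

# Would a new variable of this name clash with an existing one?
is_taken <- function(query, name) {
  paste0("?", name) %in% query$vars
}

q <- new_query()
q <- track_vars(q, "?film wdt:P31 wd:Q11424")
q <- track_vars(q, "?film wdt:P577 ?date", select = FALSE)

q$vars
#> [1] "?film" "?date"
q$selected
#> [1] "?film"
is_taken(q, "date")
#> [1] TRUE
```

With such a record, a function like spq_mutate could detect that `date` is already in use before generating any SPARQL.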

maelle commented 1 year ago

Currently reading/refactoring the code with a view to tracking more next week.

maelle commented 1 year ago

Having a tibble with for each variable

maelle commented 1 year ago

the order is important for defining the tibble, or we need to be able to re-order triples intelligently.

maelle commented 1 year ago

e.g. if one only has spq_add("?city wdt:P1082 ?pop", .required = FALSE) then a triple that has either of the two variables in it, with no other variable, would take precedence as the "defining" triple.

maelle commented 1 year ago

3 ways to define a variable: in a triple, in a VALUES thing (VALUES ?species {wd:Q144 wd:Q146 wd:Q780}), in a formula (COUNT(?blabla) AS ?blop).
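The three cases can be told apart with a small classifier. The sketch below is only illustrative: `definition_type()` is a hypothetical name, and the regular expressions are deliberately naive (a real implementation would need proper parsing).

```r
# Hypothetical classifier, not part of glitter.
definition_type <- function(fragment) {
  if (grepl("^\\s*VALUES\\b", fragment, perl = TRUE)) {
    "values"
  } else if (grepl("\\bAS\\s+\\?\\w+", fragment, perl = TRUE)) {
    "formula"
  } else {
    "triple"
  }
}

fragments <- c(
  "?film wdt:P577 ?date",
  "VALUES ?species {wd:Q144 wd:Q146 wd:Q780}",
  "COUNT(?blabla) AS ?blop"
)

types <- vapply(fragments, definition_type, character(1), USE.NAMES = FALSE)
types
#> [1] "triple"  "values"  "formula"
```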

maelle commented 1 year ago

I wonder how much order should matter. Maybe there should be a strict mode (controlled by an option). With the strict mode there could be linting!

related: #39

maelle commented 1 year ago

the linting could still be separate though. :thinking: (a separate function)
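A possible shape for this, sketched in base R under stated assumptions: `lint_order()` is a standalone linter that reports variables used before they are defined, and `check_query()` turns those reports into errors only in strict mode. Both function names, the `glitter.strict` option name and the `defines`/`uses` step representation are hypothetical.

```r
# Hypothetical linter, not part of glitter.
# Each step is assumed to be list(defines = <variables>, uses = <variables>).
lint_order <- function(steps) {
  defined <- character()
  problems <- character()
  for (step in steps) {
    undefined <- setdiff(step$uses, defined)
    if (length(undefined) > 0) {
      problems <- c(problems, sprintf("%s used before being defined", undefined))
    }
    defined <- union(defined, step$defines)
  }
  problems
}

# Strict mode is read from an option (the option name is an assumption)
check_query <- function(steps, strict = getOption("glitter.strict", FALSE)) {
  problems <- lint_order(steps)
  if (strict && length(problems) > 0) {
    stop(paste(problems, collapse = "; "))
  }
  invisible(problems)
}

steps <- list(
  list(defines = "?year", uses = "?date"),               # formula comes first
  list(defines = c("?film", "?date"), uses = character()) # triple comes second
)

lint_order(steps)
#> [1] "?date used before being defined"
```

Keeping the linter separate means it can be called on its own, while the option only decides whether its findings are fatal.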

maelle commented 1 year ago

#148

library('glitter')
spq_init() %>%
  spq_add("?film wdt:P31 wd:Q11424") %>%
  spq_label(film) %>%
  spq_add("?film wdt:P577 ?date") %>%
  spq_mutate(date=year(date)) %>% 
  spq_head(10) %>% 
  spq_perform()
#> # A tibble: 10 × 3
#>    film                                   date film_label                    
#>    <chr>                                 <dbl> <chr>                         
#>  1 http://www.wikidata.org/entity/Q32786  2012 916                           
#>  2 http://www.wikidata.org/entity/Q32790  1971 Red Sun                       
#>  3 http://www.wikidata.org/entity/Q32910  2005 Domino                        
#>  4 http://www.wikidata.org/entity/Q32910  2005 Domino                        
#>  5 http://www.wikidata.org/entity/Q33109  2008 Be Like Others                
#>  6 http://www.wikidata.org/entity/Q33131  2008 Nothing like the Holidays     
#>  7 http://www.wikidata.org/entity/Q33139  2009 To Die like a Man             
#>  8 http://www.wikidata.org/entity/Q33148  2009 Dead like Me: Life After Death
#>  9 http://www.wikidata.org/entity/Q33191  2012 People Like Us                
#> 10 http://www.wikidata.org/entity/Q33191  2012 People Like Us

Created on 2023-07-27 with reprex v2.0.2

maelle commented 1 year ago

library("glitter")
spq_init() %>%
  spq_add("?item wdt:P361 wd:Q297853") %>%
  spq_add("?item wdt:P1082 ?folkm_ngd") %>%
  spq_add("?area wdt:P31 wd:Q1907114") %>%
  spq_label(area) %>%
  spq_add("?area wdt:P527 ?item") %>%
  spq_group_by(area, area_label)  %>%
  spq_summarise(folkm_ngd = sum(folkm_ngd)) %>%
  spq_perform()
#> # A tibble: 1 × 3
#>   area                                   area_label     folkm_ngd
#>   <chr>                                  <chr>              <dbl>
#> 1 http://www.wikidata.org/entity/Q297853 Øresund Region   4102502

Created on 2023-07-28 with reprex v2.0.2