justgo129 closed this issue 9 years ago
To troubleshoot a query like this, I normally break it down into simpler queries. Once I find the statement that is causing trouble (prov:wasAttributedTo in this case), I run describe statements on the subject of the problematic statement pattern.
I broke your query down to this
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX gcis: <http://data.globalchange.gov/gcis.owl#>
#select distinct ?dataset
describe ?dataset
#?instrument_instance
#?platform_title
#(count(?platform_title) AS ?platform_name)
where {
?dataset a gcis:Dataset .
# ?dataset prov:wasAttributedTo ?instrument_instance .
# ?instrument_instance a gcis:Instrument .
# ?instrument_instance gcis:inPlatform ?platform .
# ?platform dcterms:title ?platform_title
} LIMIT 5
and from the output of the describe statements I noticed that the datasets in the triplestore are associated with the instrument instances via prov:wasDerivedFrom instead of prov:wasAttributedTo.
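For anyone who wants to see the "add one statement at a time" approach in isolation, here is a minimal Python sketch. It runs against a toy in-memory list of triples, not the real triplestore. The triples, the dataset identifiers, and the helper names (`match_pattern`, `evaluate`, `first_failing_pattern`) are all made up for illustration, and the matcher only handles plain triple patterns, not full SPARQL.

```python
# Toy triples (hypothetical): datasets linked to each other via prov:wasDerivedFrom,
# with no prov:wasAttributedTo statements at all.
TRIPLES = [
    ("ds0", "a", "gcis:Dataset"),
    ("ds1", "a", "gcis:Dataset"),
    ("ds1", "prov:wasDerivedFrom", "ds0"),
]

# The statement patterns of the query, in the order they would be added back.
PATTERNS = [
    ("?dataset", "a", "gcis:Dataset"),
    ("?dataset", "prov:wasAttributedTo", "?instrument_instance"),
    ("?instrument_instance", "a", "gcis:Instrument"),
]


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on a mismatch."""
    new_binding = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):
            # A variable must keep the value it was bound to earlier, if any.
            if new_binding.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return new_binding


def evaluate(patterns, triples):
    """Return every variable binding that satisfies all patterns together."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


def first_failing_pattern(patterns, triples):
    """Add patterns one by one and return the first that empties the result set."""
    for n in range(1, len(patterns) + 1):
        if not evaluate(patterns[:n], triples):
            return patterns[n - 1]
    return None


culprit = first_failing_pattern(PATTERNS, TRIPLES)
print(culprit)
```

On this toy data the first pattern still matches both datasets, and the result set only becomes empty once the prov:wasAttributedTo pattern is added, which is the same signal that pointed to that statement in the real query.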
The REST API shows prov:wasAttributedTo: http://data.globalchange.gov/dataset/nasa-nsidcdaac-0032.thtml
@bduggan any idea why this may be? I know we changed the template from prov:wasDerivedFrom to prov:wasAttributedTo, but the template change was merged to master 17 days ago.
USGCRP/gcis/pull/216
EDIT - after looking at the results of the describe on datasets again, I think the prov:wasDerivedFrom statements might be valid. They are dataset -> dataset derivations. There appear to be no dataset -> instrument instance relationships in the triplestore.
I think the triplestore is also missing the newer representation of instrument instances, because this query returns 0 results as well.
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX gcis: <http://data.globalchange.gov/gcis.owl#>
select distinct ?dataset
?instrument_instance
where {
?dataset a gcis:Dataset .
{ ?dataset prov:wasAttributedTo ?instrument_instance } UNION { ?dataset prov:wasDerivedFrom ?instrument_instance } .
?instrument_instance a gcis:Instrument .
}
On Monday, August 24, Stephan Zednik wrote:
> I think the triplestore is also missing the newer representation of instrument instances, because this query returns 0 results as well.
Looks like the prov namespace prefix is missing from the template, e.g.
http://data.globalchange.gov/platform/advanced-earth-observing-satellite-ii/instrument/seawinds.thtml
These are the only triples generated:
http://data.globalchange.gov/platform/advanced-earth-observing-satellite-ii/instrument/seawinds.nt
Brian
@bduggan The prov namespace was added to this template in commit https://github.com/USGCRP/gcis/commit/88a48da0cb340968e6a51e890ead3f0fdde5c5f6#diff-238eea14cfe940c0fa711f3f061da5d2 7 days ago.
If we haven't run a triplestore load in the last 7 days that could be causing the issues we are seeing with the queries.
On Tuesday, August 25, Stephan Zednik wrote:
> @bduggan The prov namespace was added to this template in commit https://github.com/USGCRP/gcis/commit/88a48da0cb340968e6a51e890ead3f0fdde5c5f6#diff-238eea14cfe940c0fa711f3f061da5d2 7 days ago.
Great, it'll go out in the next release, then.
> If we haven't run a triplestore load in the last 7 days that could be causing the issues we are seeing with the queries.
We'll need to do a release before a load: the templates in production do not include this change.
You can see the release on the about page, the X-API-Version header, or via announcements to the api-users list:
http://data.globalchange.gov/about
We are at 1.34:
https://github.com/USGCRP/gcis/tree/1.34
[bduggan@lubber bduggan]$ curl -v http://data.globalchange.gov | head
[...]
< X-API-Version: 1.34
Brian
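If it helps, the same check can be scripted. This is a rough Python sketch, assuming the X-API-Version header keeps the dotted numeric form shown in the curl output. The function names are hypothetical, and the version a fix ships in has to be supplied by the caller since it is not stated in this thread.

```python
import urllib.request


def parse_version(text):
    """Turn a dotted version string such as '1.34' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def release_includes(deployed, required):
    """True if the deployed version is at least the version that carries the fix."""
    return parse_version(deployed) >= parse_version(required)


def fetch_api_version(url="http://data.globalchange.gov"):
    """Read the X-API-Version response header (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.headers.get("X-API-Version")
```

Comparing parsed tuples rather than raw strings avoids the trap where "1.9" sorts after "1.34" as text. A caller would pass the result of `fetch_api_version()` as `deployed` and the release number announced on the api-users list as `required`.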
Thanks @bduggan
@justgo129 We will need to try the query again after the next release and load.
sounds good.
@zednis @bduggan I just retried the queries following yesterday's release and get the same output as before for all of the queries provided above.
Works for me: see the query and output in the commit above.
Great. The queries work for me now, except I get a list of outputs in lieu of a count and an additional column is added in lieu of a rename of a column. http://yasgui.org/short/Ek1A2nL2
@justgo129 could you provide an example of what you mean by "except I get a list of outputs in lieu of a count and an additional column is added in lieu of a rename of a column."?
Also, why are you naming the count of ?platform_title ?platform_name?
(count(?platform_title) AS ?platform_name)
This will return a count of the instruments on that platform that the dataset was attributed to.
Perhaps this should be (count(?platform_title) AS ?instruments_on_platform_attributed_to)
For example, http://data.globalchange.gov/dataset/nasa-nsidcdaac-0001.thtml was attributed to 4 instruments that are installed on 2 total platforms (2 instruments per platform).
dataset | platform_title | platform_name |
---|---|---|
http://data.globalchange.gov/dataset/nasa-ornldaac-16 | "National Oceanic and Atmospheric Administration - 10"^^xsd:string | "2"^^xsd:integer |
http://data.globalchange.gov/dataset/nasa-ornldaac-16 | "National Oceanic and Atmospheric Administration - 9"^^xsd:string | "2"^^xsd:integer |
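To make the grouping behavior concrete, here is a small Python sketch of the "4 instruments on 2 platforms" case. The dataset, instrument, and platform names are placeholders, not real GCIS identifiers. It only imitates how count() behaves under different groupings.

```python
from collections import Counter

# One row per (dataset, instrument, platform title) match, as the WHERE clause
# would produce them. All names are placeholders.
attributions = [
    ("dataset-x", "instrument-1", "Platform A"),
    ("dataset-x", "instrument-2", "Platform A"),
    ("dataset-x", "instrument-3", "Platform B"),
    ("dataset-x", "instrument-4", "Platform B"),
]

# Grouping by dataset AND platform title: one output row per platform,
# each carrying the number of instruments on that platform.
per_platform = Counter((dataset, title) for dataset, _instrument, title in attributions)

# Grouping by dataset only: a single row with the total number of matches.
per_dataset = Counter(dataset for dataset, _instrument, _title in attributions)

print(dict(per_platform))
print(dict(per_dataset))
```

Grouping on the platform title gives one row per platform with a count of 2, which matches the shape of the table above. Grouping on the dataset alone collapses that to a single count of 4.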
Sure, @zednis. I meant that I wrote the query expecting a count but instead got a list of information. I adjusted accordingly but still get all the entities for which 0 platforms exist. See: http://yasgui.org/short/41i80lP3
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX prov: <http://www.w3.org/ns/prov#>
select distinct ?dataset (count(?platform_title) AS ?instruments_on_platform_attributed_to) FROM <http://data.globalchange.gov> where {
?dataset a gcis:Dataset .
?dataset prov:wasAttributedTo ?instrument_instance .
?instrument_instance a gcis:Instrument .
?instrument_instance gcis:inPlatform ?platform .
?platform dcterms:title ?instruments_on_platform_attributed_to
}
group by ?dataset ?instruments_on_platform_attributed_to
having (min(?instruments_on_platform_attributed_to) > 0)
Note - edited so the query is properly formatted. Please use GitHub formatting when pasting queries so they show up correctly.
@justgo129 the query above returns a count of 0 because you never specify ?platform_title in the body of the select. It has no value. You are then overwriting ?instruments_on_platform_attributed_to with the count of an unbound variable.
Honestly, I am surprised the endpoint does not throw an error on this query.
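A minimal Python sketch of why the count comes back as 0, using made-up rows in place of real query solutions. The helper `count_bound` is hypothetical and simplifies the grouping to ?dataset only. The point it illustrates is that count(?x) only counts solutions in which ?x is bound.

```python
from collections import defaultdict

# Toy solution rows standing in for what the WHERE clause binds (made-up values).
# As in the query above, the platform title lands in
# ?instruments_on_platform_attributed_to, and ?platform_title is never bound.
rows = [
    {"dataset": "ds1", "instruments_on_platform_attributed_to": "Platform A"},
    {"dataset": "ds1", "instruments_on_platform_attributed_to": "Platform B"},
    {"dataset": "ds2", "instruments_on_platform_attributed_to": "Platform A"},
]


def count_bound(rows, group_var, counted_var):
    """Imitate count(?counted_var) grouped by ?group_var: only bound values count."""
    counts = defaultdict(int)
    for row in rows:
        group = row[group_var]
        counts[group] += 0  # make sure the group appears even when its count is 0
        if row.get(counted_var) is not None:
            counts[group] += 1
    return dict(counts)


unbound = count_bound(rows, "dataset", "platform_title")
bound = count_bound(rows, "dataset", "instruments_on_platform_attributed_to")
print(unbound)
print(bound)
```

Counting the variable that was never bound gives 0 for every dataset, while counting the variable that actually holds the title gives the expected non-zero counts.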
After updating the query so that the title of the platform is ?platform_title, the query returns 0 results as would be expected based on the filter at the end.
Here is an updated query that lists the count of instruments on platforms and you will see there are no occurrences of a 0 count.
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX gcis: <http://data.globalchange.gov/gcis.owl#>
select distinct ?dataset (count(?platform_title) AS ?instruments_on_platform_attributed_to)
FROM <http://data.globalchange.gov> where {
?dataset a gcis:Dataset .
?dataset prov:wasAttributedTo ?instrument_instance .
?instrument_instance a gcis:Instrument .
?instrument_instance gcis:inPlatform ?platform .
?platform dcterms:title ?platform_title
}
group by ?dataset
order by asc(?instruments_on_platform_attributed_to)
@zednis let's chat about this one at your convenience; I'd like to walk through the logic for my own understanding.
Thanks for the great one-on-one hangout earlier, @zednis. Expanded query to include datasets has been entered into the test suite https://github.com/USGCRP/gcis-sparql/pull/6. As such, closed #132.
Additions have been made to the gcis-ontology repo as well: See: https://github.com/USGCRP/gcis-ontology/pull/151 https://github.com/USGCRP/gcis-ontology/issues/132
Hi everyone, I wrote a SPARQL query today which fails to return results even though I believe it should. What am I doing incorrectly? The query is available at: http://yasgui.org/short/4JyIq8Vh .
The purpose of the query is to count the number of titles of platforms from which datasets, via "instrument instances," were derived. The answer should most definitely exceed 0. I'm pretty sure my syntax for "select distinct count" is correct - see the second answer provided at: http://stackoverflow.com/questions/1223472/sparql-query-and-distinct-count
(I recognize that the penultimate line in the query is unnecessary for the purpose of the query but added it solely for testing reasons).