Open zednis opened 9 years ago
Also, as an interesting exercise, note that 11 of the NCA3 findings are "report findings" and are thus findings of the entire report, not of particular chapters. https://data.globalchange.gov/report/nca3/finding?page=8
@justgo129 did you mean to close this ticket?
Oops, didn't mean to, I hit the wrong button. I'm sorry. Thanks for catching that.
https://data.globalchange.gov/sparql
Is something wrong here? I could not get any query results at the moment...
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gcis: <http://data.globalchange.gov/gcis.owl#>
PREFIX cito: <http://purl.org/spar/cito/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbpprop: <http://dbpedia.org/property/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT
DISTINCT ?finding ?journal
FROM <http://data.globalchange.gov>
WHERE {
<http://data.globalchange.gov/report/nca3> gcis:hasChapter ?chapter .
<http://data.globalchange.gov/report/nca3> gcis:hasFinding ?finding .
OPTIONAL{ ?chapter gcis:hasFinding ?finding .}
?finding cito:cites ?publication .
?publication dcterms:isPartOf ?journal .
FILTER (regex(str(?journal), "nature", "i") || regex(str(?journal), "^http://data.globalchange.gov/journal/science$", "i"))
}
Interesting approach. Science and Nature were meant as examples, though, not as an all-inclusive list. Can we generalize this to incorporate everything with a certain impact factor?
(In reply to lic10's comment of Tue, Aug 18, 2015, 4:01 PM: https://github.com/USGCRP/gcis-ontology/issues/116#issuecomment-132333504)
Justin Goldstein, Ph.D. Advance Science Climate Data and Observing Systems Coordinator US Global Change Research Program 1800 G Street NW, Suite 9100, (Note New Address) Washington, D.C. 20006, U.S.A.
O: (202) 419-3496 M: (202) 285-3005
e-mail: jgoldstein AT usgcrp Dot gov http://www.globalchange.gov
Some more explanation of the "FILTER" condition, which is a bit tricky: every journal whose URI contains the string "nature" belongs to the Nature family, so a case-insensitive substring match is enough there. For "Science" we need to match the entire URI, because partial matches such as "Biogeosciences" would be wrong.
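To make the two matching rules concrete, here is a minimal Python sketch of the same idea. The journal URIs are hypothetical examples in the GCIS URI style (they are not query results), and the dots in the Science pattern are escaped here, which the original SPARQL pattern does not do.

```python
import re

# Hypothetical journal URIs, used only to exercise the two patterns.
JOURNALS = [
    "http://data.globalchange.gov/journal/nature",
    "http://data.globalchange.gov/journal/nature-climate-change",
    "http://data.globalchange.gov/journal/science",
    "http://data.globalchange.gov/journal/biogeosciences",
]

# Substring match: anything containing "nature", case-insensitively.
NATURE = re.compile(r"nature", re.IGNORECASE)
# Whole-string match: only the Science journal URI itself.
SCIENCE = re.compile(
    r"^http://data\.globalchange\.gov/journal/science$", re.IGNORECASE
)


def is_nature_or_science(uri: str) -> bool:
    """Return True if the URI passes either of the two FILTER-style tests."""
    return bool(NATURE.search(uri) or SCIENCE.search(uri))


matches = [uri for uri in JOURNALS if is_nature_or_science(uri)]
```

A plain substring test for "science" would also accept the "biogeosciences" URI, which is why the Science pattern is anchored at both ends.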
@justgo129 I'm not quite sure what the question is asking.
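Could you please explain a bit more regarding the following two sentences?
1. "calculate the percentage of total references"
2. "generalize this to incorporate everything with a certain impact factor"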
Sure.
Also, don't forget that the Annals of the Association of American Geographers, the New England Journal of Medicine, etc. would have an impact factor equivalent to that of Science, Nature, etc. The query would need to be all inclusive to cover articles from journals of any title that meet a certain impact factor.
On Tue, Aug 18, 2015 at 4:08 PM, lic10 notifications@github.com wrote:
@justgo129 I'm not quite sure what the question is asking.
Could you please explain a bit more regarding the following two sentences?
1. "calculate the percentage of total references"
2. "generalize this to incorporate everything with a certain impact factor"
@justgo129 The impact factors are currently not included in the database and they would change over time. Could you please just provide a list of what journals you would like to take into account for this question? Otherwise it would be too complicated ...
@lic10 I suggest examining http://dbpedia.org/ontology/impactFactor (per Curt Tilmes's comment earlier, actually). Also, this is envisioned as a federated query, involving the mining of information from other databases like Web of Science, etc. Impact factors don't actually change much over time.
@justgo129 I am creating rdf triples for the impact factor of journals using some data I found online. When it's done, we could load the triples in the database and do the sparql query regarding impact factor. Here's another question: should we add this in our GCIS ontology? gcis:Journal gcis:has2014ImpactFactor xsd:decimal
Excellent. I don't think we should add the "gcis:has2014ImpactFactor" predicate to our ontology but I can be convinced otherwise.
Here are the rdf triples:
https://drive.google.com/file/d/0B4GwxoO9tVwJZWFZSDhrS2R2dGs/view?usp=sharing
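For readers who want to see what such a triple looks like, here is a minimal sketch that formats one N-Triples line in Python. The predicate gcis:has2014ImpactFactor is only the proposal from this thread, not an existing term in the GCIS ontology, and the journal URI and the value 1.5 are placeholders rather than real data.

```python
from decimal import Decimal

GCIS = "http://data.globalchange.gov/gcis.owl#"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def impact_factor_triple(journal_uri: str, factor: Decimal) -> str:
    """Format one N-Triples line using the proposed (hypothetical) predicate."""
    return (
        f"<{journal_uri}> <{GCIS}has2014ImpactFactor> "
        f'"{factor}"^^<{XSD_DECIMAL}> .'
    )


# Placeholder journal and value, for illustration only.
line = impact_factor_triple(
    "http://data.globalchange.gov/journal/example-journal", Decimal("1.5")
)
```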
We shouldn't have to load triples. A federated query can use the triples from dbpedia and query both places dynamically.
The following two queries are both working on the GCIS sparql endpoint:
i. relate GCIS findings to journals and their issn:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gcis: <http://data.globalchange.gov/gcis.owl#>
PREFIX cito: <http://purl.org/spar/cito/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX bibo: <http://purl.org/ontology/bibo/>
SELECT
DISTINCT ?finding ?journal ?issn
WHERE {
<http://data.globalchange.gov/report/nca3> gcis:hasChapter ?chapter .
<http://data.globalchange.gov/report/nca3> gcis:hasFinding ?finding .
OPTIONAL{ ?chapter gcis:hasFinding ?finding .}
?finding cito:cites ?publication .
?publication dcterms:isPartOf ?journal .
?journal bibo:issn ?issn .
}
ii. find all the journal issn's and their impact factors from dbpedia:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?issn ?impactfactor
FROM <http://dbpedia.org>
WHERE {
SERVICE <http://dbpedia.org/sparql> {
?dbjournal a dbo:AcademicJournal .
?dbjournal dbo:issn ?issn .
?dbjournal dbo:impactFactor ?impactfactor .
}
}
I am having difficulty combining them into a federated query. Still trying. Any suggestions?
After combining the two queries like this:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gcis: <http://data.globalchange.gov/gcis.owl#>
PREFIX cito: <http://purl.org/spar/cito/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX bibo: <http://purl.org/ontology/bibo/>
SELECT
DISTINCT ?finding ?journal ?issn ?impactfactor
WHERE {
<http://data.globalchange.gov/report/nca3> gcis:hasChapter ?chapter .
<http://data.globalchange.gov/report/nca3> gcis:hasFinding ?finding .
OPTIONAL{ ?chapter gcis:hasFinding ?finding .}
?finding cito:cites ?publication .
?publication dcterms:isPartOf ?journal .
?journal bibo:issn ?issn .
SERVICE <http://dbpedia.org/sparql> {
?dbjournal a dbo:AcademicJournal .
?dbjournal dbo:issn ?issn .
?dbjournal dbo:impactFactor ?impactfactor .
}
}
I got the following error: Virtuoso 22023 Error SR012: Function aref needs a string or an array as argument 1, not an arg of type DB_NULL (204)
Do we know if the version of virtuoso we are using supports federated queries?
I do not know the answer. I will try it on the DBpedia SPARQL endpoint, too.
I was able to successfully run this query on the GCIS endpoint, so it does support the SERVICE keyword
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gcis: <http://data.globalchange.gov/gcis.owl#>
PREFIX cito: <http://purl.org/spar/cito/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT
DISTINCT ?impactfactor
WHERE {
SERVICE <http://dbpedia.org/sparql> {
?dbjournal a dbo:AcademicJournal .
?dbjournal dbo:issn ?issn .
?dbjournal dbo:impactFactor ?impactfactor .
}
}
I did find this 2013 review of federated query support, which indicates that Virtuoso 6.1 has some federated query support but does not support federated BINDINGS:
https://www.insight-centre.org/sites/default/files/publications/1306.1723v1.pdf
This is also not working on the dbpedia sparql endpoint:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gcis: <http://data.globalchange.gov/gcis.owl#>
PREFIX cito: <http://purl.org/spar/cito/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX bibo: <http://purl.org/ontology/bibo/>
SELECT
DISTINCT ?finding ?journal ?issn ?impactfactor
WHERE {
SERVICE <https://data.globalchange.gov/sparql> {
<http://data.globalchange.gov/report/nca3> gcis:hasChapter ?chapter .
<http://data.globalchange.gov/report/nca3> gcis:hasFinding ?finding .
OPTIONAL{ ?chapter gcis:hasFinding ?finding .}
?finding cito:cites ?publication .
?publication dcterms:isPartOf ?journal .
?journal bibo:issn ?issn .
}
SERVICE <http://dbpedia.org/sparql> {
?dbjournal a dbo:AcademicJournal .
?dbjournal dbo:issn ?issn .
?dbjournal dbo:impactFactor ?impactfactor .
}
}
This seems to work. I am not sure why what you have been trying does not work...
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gcis: <http://data.globalchange.gov/gcis.owl#>
PREFIX cito: <http://purl.org/spar/cito/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX bibo: <http://purl.org/ontology/bibo/>
SELECT DISTINCT ?finding ?journal ?issn1 ?impactfactor
WHERE {
FILTER(?issn1 = ?issn2)
SERVICE <https://data.globalchange.gov/sparql> {
{ <http://data.globalchange.gov/report/nca3> gcis:hasChapter ?chapter . ?chapter gcis:hasFinding ?finding }
UNION
{ <http://data.globalchange.gov/report/nca3> gcis:hasFinding ?finding }
?finding cito:cites ?publication .
?publication dcterms:isPartOf ?journal .
?journal bibo:issn ?issn1 .
}
SERVICE <http://dbpedia.org/sparql> {
?journal2 a dbo:AcademicJournal .
?journal2 dbo:issn ?issn2 .
?journal2 dbo:impactFactor ?impactfactor .
}
} LIMIT 10
Hmm, I am now uncertain whether the query I posted works. The value of ?issn2 shown in the SELECT results does not appear to match the ISSN you see if you go to the instance URI...
The result is not correct. A single "issn" matches a bunch of "impactfactor" values.
I think this query works, but it times out if you have a limit value greater than 7. I will attempt some refactoring to see if I can make it more efficient
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gcis: <http://data.globalchange.gov/gcis.owl#>
PREFIX cito: <http://purl.org/spar/cito/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX bibo: <http://purl.org/ontology/bibo/>
SELECT DISTINCT ?finding ?journal ?journal2 ?issn1 ?issn2 ?impactfactor
WHERE {
FILTER(str(?issn2) = ?issn1)
SERVICE <https://data.globalchange.gov/sparql> {
{ <http://data.globalchange.gov/report/nca3> gcis:hasChapter ?chapter . ?chapter gcis:hasFinding ?finding }
UNION
{ <http://data.globalchange.gov/report/nca3> gcis:hasFinding ?finding }
?finding cito:cites ?publication .
?publication dcterms:isPartOf ?journal .
?journal bibo:issn ?issn1 .
}
SERVICE <http://dbpedia.org/sparql> {
?journal2 a dbo:AcademicJournal .
?journal2 dbo:issn ?issn2 .
?journal2 dbo:impactFactor ?impactfactor .
}
} limit 7
If a federated query cannot give us the right result, I suggest we load the RDF triples I created (gcis:Journal gcis:has2014ImpactFactor xsd:decimal) and run the query within the GCIS endpoint.
In reply to lic10's comment of Monday, August 24:
I disagree. We do not want to maintain triples generated elsewhere. The triple store is rebuilt on every release and we are not maintaining a mechanism for repopulating using subsets of external datasets.
Brian
I agree with @bduggan. @lic10 please advise on the best way to accomplish this. Thanks a million.
@justgo129 Unfortunately, at this moment correct results cannot be obtained using a federated query.
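One possible workaround, sketched here under assumptions rather than proposed as the project's approach: since queries i and ii each work on their own, their result sets could be fetched separately and joined client-side on the ISSN string. The row layout (dicts keyed by the SELECT variable names) and all sample values below are hypothetical placeholders.

```python
def normalize_issn(value: str) -> str:
    """Strip whitespace and hyphens so '0000-0001' and '00000001' compare equal."""
    return value.strip().replace("-", "").upper()


def join_on_issn(gcis_rows, dbpedia_rows):
    """Attach an impact factor to each GCIS row whose ISSN appears in the DBpedia rows."""
    factor_by_issn = {}
    for row in dbpedia_rows:
        # Keep the first value seen, so each ISSN maps to exactly one impact factor.
        factor_by_issn.setdefault(normalize_issn(row["issn"]), row["impactfactor"])

    joined = []
    for row in gcis_rows:
        key = normalize_issn(row["issn"])
        if key in factor_by_issn:
            joined.append({**row, "impactfactor": factor_by_issn[key]})
    return joined


# Placeholder rows standing in for the results of queries i and ii.
gcis_rows = [
    {"finding": "finding-a", "journal": "journal-x", "issn": "0000-0001"},
    {"finding": "finding-b", "journal": "journal-y", "issn": "0000-0002"},
]
dbpedia_rows = [
    {"issn": "00000001", "impactfactor": "1.5"},
    {"issn": "0000-0003", "impactfactor": "2.5"},
]

joined = join_on_issn(gcis_rows, dbpedia_rows)
```

Because the join goes through a dictionary keyed by the normalized ISSN, a GCIS row can pick up at most one impact factor, which avoids the one-ISSN-to-many-values result seen above.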
@CurtTilmes could you be of assistance?
Not much to add...
I agree with Brian: I wouldn't try to pull impact factors into your triple store. We just want to use the external judgements on journals (e.g. impact factor in this particular case, but it could be any other external factor that someone wants to use to filter journals). Other databases have many facts about journals.
Thanks, @CurtTilmes. @lic10 try http://academia.stackexchange.com/questions/3/where-can-i-find-the-impact-factor-for-a-given-journal
@lic10 I'm just checking on the status of this ticket.
@justgo129 I could get the impact factors. The problem comes from the federated query we are trying to use.
Per 10/13 discussion, let's table this until after we update virtuoso. We'll then retest the federated query and tweak if need be for performance reasons.
"Locate all findings based at least partially on articles with a top journal ranking (e.g. Nature and Science) per an official citation metric of one's choice, and calculate the percentage of total references."
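Once impact factors are available per reference, the arithmetic behind this question is small. The sketch below is one reading of it, assuming "percentage of total references" means distinct references from journals at or above a chosen threshold, divided by all distinct references. The function name, the tuple layout, and the sample values are hypothetical.

```python
def top_journal_share(citations, threshold):
    """Return (findings citing a top-journal reference, % of references that are top-journal).

    citations: iterable of (finding, reference, impact_factor) tuples, where
    impact_factor is None when no value is known for the reference's journal.
    """
    citations = list(citations)
    all_refs = {ref for _, ref, _ in citations}
    top_refs = {
        ref
        for _, ref, factor in citations
        if factor is not None and factor >= threshold
    }
    findings = sorted({finding for finding, ref, _ in citations if ref in top_refs})
    pct = 100.0 * len(top_refs) / len(all_refs) if all_refs else 0.0
    return findings, pct


# Placeholder data: four distinct references, one of them above the threshold.
citations = [
    ("finding-a", "ref-1", 30.0),
    ("finding-a", "ref-2", 2.0),
    ("finding-b", "ref-2", 2.0),
    ("finding-b", "ref-3", None),
    ("finding-c", "ref-1", 30.0),
    ("finding-c", "ref-4", 1.0),
]

findings, pct = top_journal_share(citations, threshold=10.0)
```

References with no known impact factor are counted in the denominator but never as top-journal references; a different choice there would change the percentage.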