Open WolfgangFahl opened 2 years ago
I have used the sparqlquery command-line tool from https://github.com/WolfgangFahl/pyLoDStorage to show the details of the query named "ACL-Paper2Event" in https://github.com/WolfgangFahl/pyLoDStorage/blob/master/sampledata/scholia.yaml:
sparqlquery -qp scholia.yaml -qn "ACL-Paper2Event" -f github
# ACL Anthology article ID
SELECT ?article ?articleLabel ?aclId ?publishedIn ?publishedInLabel ?event ?eventLabel WHERE {
#ACL Anthology article ID
?article wdt:P7505 ?aclId.
?article rdfs:label ?articleLabel .
#?aclIdStatement (ps:P7505) ?aclId.
?article wdt:P1433 ?publishedIn.
?publishedIn rdfs:label ?publishedInLabel .
#OPTIONAL {
# is proceedings from
?publishedIn wdt:P4745 ?event.
?event rdfs:label ?eventLabel.
#}
} LIMIT 50
article | articleLabel | aclId | publishedIn | publishedInLabel | event | eventLabel |
---|---|---|---|---|---|---|
Q79020060 | Common Voice: A Massively-Multilingual Speech Corpus | 2020.lrec-1.520 | Q95997327 | Proceedings of The 12th Language Resources and Evaluation Conference | Q61919909 | 12th Conference on Language Resources and Evaluation |
Q79020060 | Common Voice: A Massively-Multilingual Speech Corpus | 2020.lrec-1.520 | Q95997327 | Proceedings of The 12th Language Resources and Evaluation Conference | Q61919909 | 12th Conference on Language Resources and Evaluation |
Q79020060 | Common Voice: A Massively-Multilingual Speech Corpus | 2020.lrec-1.520 | Q95997327 | Proceedings of The 12th Language Resources and Evaluation Conference | Q61919909 | 12th Conference on Language Resources and Evaluation |
Q79020060 | Common Voice: A Massively-Multilingual Speech Corpus | 2020.lrec-1.520 | Q95997327 | Proceedings of The 12th Language Resources and Evaluation Conference | Q61919909 | 12th Conference on Language Resources and Evaluation |
Q79020060 | Common Voice: A Massively-Multilingual Speech Corpus | 2020.lrec-1.520 | Q95997327 | Proceedings of The 12th Language Resources and Evaluation Conference | Q61919909 | 12th Conference on Language Resources and Evaluation |
Q79020060 | Common Voice: A Massively-Multilingual Speech Corpus | 2020.lrec-1.520 | Q95997327 | Proceedings of The 12th Language Resources and Evaluation Conference | Q61919909 | 12th Conference on Language Resources and Evaluation |
Q79020060 | Common Voice: A Massively-Multilingual Speech Corpus | 2020.lrec-1.520 | Q95997327 | Proceedings of The 12th Language Resources and Evaluation Conference | Q61919909 | 12th Conference on Language Resources and Evaluation |
Q79020060 | Common Voice: A Massively-Multilingual Speech Corpus | 2020.lrec-1.520 | Q95997327 | Proceedings of The 12th Language Resources and Evaluation Conference | Q61919909 | 12th Conference on Language Resources and Evaluation |
Q61895831 | The word analogy testing caveat | N18-2039 | Q55434859 | Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) | Q75696024 | The 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
Q61895831 | The word analogy testing caveat | N18-2039 | Q55434859 | Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) | Q75696024 | The 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
Q61895831 | The word analogy testing caveat | N18-2039 | Q55434859 | Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) | Q75696024 | The 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
Q61895831 | The word analogy testing caveat | N18-2039 | Q55434859 | Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) | Q75696024 | The 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
Q61895831 | The word analogy testing caveat | N18-2039 | Q55434859 | Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) | Q75696024 | The 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
Q61895831 | The word analogy testing caveat | N18-2039 | Q55434859 | Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) | Q75696024 | The 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
Q61895831 | The word analogy testing caveat | N18-2039 | Q55434859 | Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) | Q75696024 | The 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
Q61895831 | The word analogy testing caveat | N18-2039 | Q55434859 | Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) | Q75696024 | The 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
Q61895831 | The word analogy testing caveat | N18-2039 | Q55434859 | Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) | Q75696024 | The 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
Q61895831 | The word analogy testing caveat | N18-2039 | Q55434859 | Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) | Q75696024 | The 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
Q61895831 | The word analogy testing caveat | N18-2039 | Q55434859 | Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) | Q75696024 | The 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
Q61895831 | The word analogy testing caveat | N18-2039 | Q55434859 | Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) | Q75696024 | The 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |
Q110887400 | The Power of Scale for Parameter-Efficient Prompt Tuning | 2021.emnlp-main.243 | Q109517629 | Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing | Q109517651 | The 2021 Conference on Empirical Methods in Natural Language Processing |
Q110887400 | The Power of Scale for Parameter-Efficient Prompt Tuning | 2021.emnlp-main.243 | Q109517629 | Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing | Q109517651 | The 2021 Conference on Empirical Methods in Natural Language Processing |
Q110887400 | The Power of Scale for Parameter-Efficient Prompt Tuning | 2021.emnlp-main.243 | Q109517629 | Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing | Q109517651 | The 2021 Conference on Empirical Methods in Natural Language Processing |
Q110887400 | The Power of Scale for Parameter-Efficient Prompt Tuning | 2021.emnlp-main.243 | Q109517629 | Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing | Q109517651 | The 2021 Conference on Empirical Methods in Natural Language Processing |
Q108673464 | Wikipedia2Vec: An Efficient Toolkit for Learning and Visualizing the Embeddings of Words and Entities from Wikipedia | 2020.emnlp-demos.4 | Q108673475 | Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations | Q82290350 | The 2020 Conference on Empirical Methods in Natural Language Processing |
Q108673464 | Wikipedia2Vec: An Efficient Toolkit for Learning and Visualizing the Embeddings of Words and Entities from Wikipedia | 2020.emnlp-demos.4 | Q108673475 | Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations | Q82290350 | The 2020 Conference on Empirical Methods in Natural Language Processing |
Q108673464 | Wikipedia2Vec: An Efficient Toolkit for Learning and Visualizing the Embeddings of Words and Entities from Wikipedia | 2020.emnlp-demos.4 | Q108673475 | Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations | Q82290350 | The 2020 Conference on Empirical Methods in Natural Language Processing |
Q108673464 | Wikipedia2Vec: An Efficient Toolkit for Learning and Visualizing the Embeddings of Words and Entities from Wikipedia | 2020.emnlp-demos.4 | Q108673475 | Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations | Q82290350 | The 2020 Conference on Empirical Methods in Natural Language Processing |
Q108673464 | Wikipedia2Vec: An Efficient Toolkit for Learning and Visualizing the Embeddings of Words and Entities from Wikipedia | 2020.emnlp-demos.4 | Q108673475 | Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations | Q82290350 | The 2020 Conference on Empirical Methods in Natural Language Processing |
Q108673464 | Wikipedia2Vec: An Efficient Toolkit for Learning and Visualizing the Embeddings of Words and Entities from Wikipedia | 2020.emnlp-demos.4 | Q108673475 | Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations | Q82290350 | The 2020 Conference on Empirical Methods in Natural Language Processing |
Q108673464 | Wikipedia2Vec: An Efficient Toolkit for Learning and Visualizing the Embeddings of Words and Entities from Wikipedia | 2020.emnlp-demos.4 | Q108673475 | Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations | Q82290350 | The 2020 Conference on Empirical Methods in Natural Language Processing |
Q108673464 | Wikipedia2Vec: An Efficient Toolkit for Learning and Visualizing the Embeddings of Words and Entities from Wikipedia | 2020.emnlp-demos.4 | Q108673475 | Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations | Q82290350 | The 2020 Conference on Empirical Methods in Natural Language Processing |
Q107060118 | The Danish Gigaword Corpus | 2021.nodalida-main.46 | Q107059887 | Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021 | Q102274071 | The 23rd Nordic Conference on Computational Linguistics |
Q107060118 | The Danish Gigaword Corpus | 2021.nodalida-main.46 | Q107059887 | Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021 | Q102274071 | The 23rd Nordic Conference on Computational Linguistics |
Q107060118 | The Danish Gigaword Corpus | 2021.nodalida-main.46 | Q107059887 | Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021 | Q102274071 | The 23rd Nordic Conference on Computational Linguistics |
Q107060118 | The Danish Gigaword Corpus | 2021.nodalida-main.46 | Q107059887 | Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021 | Q102274071 | The 23rd Nordic Conference on Computational Linguistics |
Q107060118 | The Danish Gigaword Corpus | 2021.nodalida-main.46 | Q107059887 | Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021 | Q102274071 | The 23rd Nordic Conference on Computational Linguistics |
Q107060118 | The Danish Gigaword Corpus | 2021.nodalida-main.46 | Q107059887 | Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021 | Q102274071 | The 23rd Nordic Conference on Computational Linguistics |
Q107060118 | The Danish Gigaword Corpus | 2021.nodalida-main.46 | Q107059887 | Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021 | Q102274071 | The 23rd Nordic Conference on Computational Linguistics |
Q107060118 | The Danish Gigaword Corpus | 2021.nodalida-main.46 | Q107059887 | Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021 | Q102274071 | The 23rd Nordic Conference on Computational Linguistics |
Q107060118 | The Danish Gigaword Corpus | 2021.nodalida-main.46 | Q107059887 | Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021 | Q102274071 | The 23rd Nordic Conference on Computational Linguistics |
Q107060118 | The Danish Gigaword Corpus | 2021.nodalida-main.46 | Q107059887 | Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021 | Q102274071 | The 23rd Nordic Conference on Computational Linguistics |
Q107060118 | The Danish Gigaword Corpus | 2021.nodalida-main.46 | Q107059887 | Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021 | Q102274071 | The 23rd Nordic Conference on Computational Linguistics |
Q107060118 | The Danish Gigaword Corpus | 2021.nodalida-main.46 | Q107059887 | Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31-June 2, 2021 | Q102274071 | The 23rd Nordic Conference on Computational Linguistics |
Q105730737 | DanNet2: Extending the coverage of adjectives in DanNet based on thesaurus data (project presentation) | 2021.gwc-1.31 | Q105730699 | Proceedings of the 11th Global Wordnet Conference | Q105730832 | The 11th Global WordNet Conference |
Q105730737 | DanNet2: Extending the coverage of adjectives in DanNet based on thesaurus data (project presentation) | 2021.gwc-1.31 | Q105730699 | Proceedings of the 11th Global Wordnet Conference | Q105730832 | The 11th Global WordNet Conference |
Q105730737 | DanNet2: Extending the coverage of adjectives in DanNet based on thesaurus data (project presentation) | 2021.gwc-1.31 | Q105730699 | Proceedings of the 11th Global Wordnet Conference | Q105730832 | The 11th Global WordNet Conference |
Q105730737 | DanNet2: Extending the coverage of adjectives in DanNet based on thesaurus data (project presentation) | 2021.gwc-1.31 | Q105730699 | Proceedings of the 11th Global Wordnet Conference | Q105730832 | The 11th Global WordNet Conference |
Q107009138 | Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training | 2021.naacl-main.278 | Q107009154 | Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies | Q107009143 | 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics |
Q107009138 | Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training | 2021.naacl-main.278 | Q107009154 | Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies | Q107009143 | 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics |
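The 50 rows above contain only seven distinct articles; every row is repeated several times. One plausible cause (an assumption, not confirmed in this thread) is that each rdfs:label pattern matches once per language, so label combinations multiply and use up the LIMIT. As a minimal sketch, the repeated rows of such a pipe-delimited table could be collapsed after the fact. The helper name `unique_rows` is hypothetical, and the sample rows are shortened copies of the rows above.

```python
def unique_rows(table_lines):
    """Return the table lines with repeated rows removed, keeping first-seen order."""
    seen = set()
    result = []
    for line in table_lines:
        # Compare rows cell by cell, ignoring whitespace around each cell.
        key = tuple(cell.strip() for cell in line.strip().strip("|").split("|"))
        if key not in seen:
            seen.add(key)
            result.append(line)
    return result


# Shortened sample rows (first four columns only) taken from the table above.
rows = [
    "Q79020060 | Common Voice: A Massively-Multilingual Speech Corpus | 2020.lrec-1.520 | Q95997327 |",
    "Q79020060 | Common Voice: A Massively-Multilingual Speech Corpus | 2020.lrec-1.520 | Q95997327 |",
    "Q61895831 | The word analogy testing caveat | N18-2039 | Q55434859 |",
]
deduped = unique_rows(rows)
print(len(deduped))
```

Removing duplicates after the query has run does not recover the results that the LIMIT cut off; it only makes the existing output easier to read.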
Is there any reason you want to do this on Wikidata, instead of using the XML/YAML files we have in this repo?
(FWIW I'm not aware of any Anthology maintainer being involved in Wikidata, so I would be surprised if anyone of us could help you there.)
@mbollmann thanks for the swift reply. Wikidata is just a good environment, especially given the Scholia project. See https://scholia.toolforge.org/event-series/Q56571145 for an example entry for an event. https://www.wikidata.org/wiki/Property:P7505 states that there are potentially 50,000 articles. On the ACL Anthology website I found: "The ACL Anthology currently hosts 74465 papers on the study of computational linguistics and natural language processing."
Indeed, I might be interested in analysing the XML/YAML files and looking for conference proceedings. It looks like no bot has transferred the entries to Wikidata yet (the WikiCite project).
I see. I'm not familiar with the Scholia project unfortunately; I do know Wikidata, but I am not aware of any transfer between the ACL Anthology and Wikidata, or who might have done it for the entries that already exist there.
Here's a quick example of what you can get from our Python library (in bin/):
>>> ant = Anthology("../data/")
>>> paper = ant.papers["2020.lrec-1.520"]
>>> ant.volumes[paper.parent_volume_id].get_title()
'Proceedings of the 12th Language Resources and Evaluation Conference'
>>> ant.venues.get_main_venue("2020.lrec-1.520")
'LREC'
>>> ant.venues.get_by_acronym("LREC")["name"]
'International Conference on Language Resources and Evaluation'
...where "2020.lrec-1.520" can be any ACL paper ID, of course. The information is pulled from the XML/YAML files in the data/ directory, so of course you could also use other tools to extract data from them.
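As a rough illustration of the "other tools" route, the sketch below maps paper IDs to proceedings titles using only Python's standard library. The XML layout in `SAMPLE_XML` (element names `collection`, `volume`, `meta`, `booktitle`, `paper`) and the way the ID is assembled are assumptions made for this example, not a confirmed description of the files in data/. The function name `papers_to_volume_titles` is hypothetical as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for one Anthology XML file.
# The real files may use different element names or nesting.
SAMPLE_XML = """
<collection id="2020.lrec">
  <volume id="1">
    <meta>
      <booktitle>Proceedings of the 12th Language Resources and Evaluation Conference</booktitle>
    </meta>
    <paper id="520">
      <title>Common Voice: A Massively-Multilingual Speech Corpus</title>
    </paper>
  </volume>
</collection>
"""


def papers_to_volume_titles(xml_text):
    """Map each paper ID to the title of the volume that contains it."""
    root = ET.fromstring(xml_text.strip())
    collection_id = root.get("id")
    mapping = {}
    for volume in root.iter("volume"):
        booktitle = volume.findtext("meta/booktitle", default="")
        for paper in volume.iter("paper"):
            # Assumed ID scheme: <collection>-<volume>.<paper>
            paper_id = f"{collection_id}-{volume.get('id')}.{paper.get('id')}"
            mapping[paper_id] = booktitle
    return mapping


titles = papers_to_volume_titles(SAMPLE_XML)
print(titles)
```

If the real files follow a similar shape, the same loop could be run over every file in the directory to build a complete paper-to-proceedings table without going through Wikidata.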
For my research I am trying to trace from papers to events (conferences). The following SPARQL query gives some good results, but the result seems to be incomplete. How could this situation be improved?
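One possible reading of "incomplete" is that the query only returns articles whose proceedings have a P4745 ("is proceedings from") statement, because that pattern is required rather than optional. The sketch below builds a variant of the query in which the event part is wrapped in OPTIONAL and the labels are restricted to a single language. The function name `build_paper2event_query` and its parameters are hypothetical, and whether this variant actually yields more complete results depends on the data in Wikidata.

```python
def build_paper2event_query(lang="en", limit=50):
    """Build a SPARQL query string with an optional event part and one label language."""
    return f"""
SELECT DISTINCT ?article ?articleLabel ?aclId ?publishedIn ?publishedInLabel ?event ?eventLabel WHERE {{
  # ACL Anthology article ID
  ?article wdt:P7505 ?aclId.
  ?article rdfs:label ?articleLabel.
  FILTER(LANG(?articleLabel) = "{lang}")
  ?article wdt:P1433 ?publishedIn.
  ?publishedIn rdfs:label ?publishedInLabel.
  FILTER(LANG(?publishedInLabel) = "{lang}")
  OPTIONAL {{
    # is proceedings from
    ?publishedIn wdt:P4745 ?event.
    ?event rdfs:label ?eventLabel.
    FILTER(LANG(?eventLabel) = "{lang}")
  }}
}} LIMIT {limit}
"""


query = build_paper2event_query()
print(query)
```

Rows with an empty ?event column would then point at proceedings that are not yet linked to an event, which may be the gaps worth filling.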