danielabar / globi-proto

InfoVis 2015 IVMOOC Globi Explorer
http://danielabar.github.io/globi-proto
MIT License

suggestions for source citations: use study_url / study_citation / study_source_citation #59

Closed jhpoelen closed 9 years ago

jhpoelen commented 9 years ago

Currently, study_title is used to indicate who reported a specific interaction. For historical reasons, study_title is used more like a unique ID than a human-readable citation. If you'd like to include human-readable citations for sources, I'd suggest including study_citation and study_source_citation in the fields. The first is a citation of the observation and the second is a citation of the data provider.

For example, the image below has a citation that includes both study_citation (before "Provider:") and study_source_citation (after "Provider:").

[Image: screen shot 2015-04-12 at 9 28 03 am]
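As a rough illustration of the combined citation described above, here is a minimal Python sketch. The function name `format_reference`, the exact spacing around "Provider:", and the fallback to study_title when both citation fields are empty are assumptions made for illustration; they are not taken from the app or the GloBI API.

```python
def format_reference(record):
    """Join the observation citation and the data provider citation.

    `record` is assumed to be a dict holding the fields requested from the
    API (study_title, study_citation, study_source_citation).
    """
    citation = (record.get("study_citation") or "").strip()
    provider = (record.get("study_source_citation") or "").strip()
    if citation and provider:
        # study_citation goes before "Provider:", study_source_citation after.
        return f"{citation} Provider: {provider}"
    # Assumed fallback: show whichever piece exists, else the study_title id.
    return citation or provider or record.get("study_title", "")
```

The fallback keeps a reference visible even for records where neither citation field is populated.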

I hope this helps tidy up the references that are included in the app.

danielabar commented 9 years ago

Thanks, I'll update the code to use study_citation and study_source_citation instead of study_title.

danielabar commented 9 years ago

As it turns out, I do need study_title as a unique identifier to build the observation map markers, because that API call returns seeming dupes for combos of study_title, lat and lng, which I assumed meant multiple observations.

But I will also bring in the other study attributes for friendly display purposes.
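The grouping described above could look roughly like the following Python sketch. It assumes the API rows have already been parsed into dicts keyed by the requested field names; `build_markers` and the shape of the marker dict are hypothetical and not the app's actual code.

```python
def build_markers(rows):
    """Group rows by (study_title, latitude, longitude).

    Repeated combinations are counted as multiple observations at the same
    marker, and the citation fields are carried along for friendly display.
    """
    markers = {}
    for row in rows:
        key = (row["study_title"], row["latitude"], row["longitude"])
        marker = markers.get(key)
        if marker is None:
            marker = {
                "study_id": row["study_title"],  # used only as a unique id
                "latitude": row["latitude"],
                "longitude": row["longitude"],
                "citation": row.get("study_citation"),
                "provider": row.get("study_source_citation"),
                "observations": 0,
            }
            markers[key] = marker
        marker["observations"] += 1
    # dicts preserve insertion order, so markers come back in first-seen order
    return list(markers.values())
```

Keeping study_title as the grouping key while displaying the citation fields separates identity from presentation.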

jhpoelen commented 9 years ago

@danielabar Thanks for your quick reply! Using study_title as a unique study ID makes perfect sense to me, as the property should really be renamed to something like "study_id". Let me know if you need more details about the citation fields.

danielabar commented 9 years ago

I'm noticing there are some cases where study_title is different, but study_citation and study_source_citation are the same. So the code thinks they're two different things and generates a list of two references. For example:

GET http://api.globalbioticinteractions.org/interaction?fields=study_title,study_citation,study_source_citation,study_url,latitude,longitude,source_taxon_name,target_taxon_name&includeObservations=true&interactionType=eats&sourceTaxon=Neophylax&targetTaxon=Bacillariophyceae&type=json.v2
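For reference, the same request URL can be assembled from its parts. This is a small Python sketch using only the standard library; the function name `build_interaction_url` and its parameters are made up for illustration, while the endpoint, field names, and query parameters are copied from the request above.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.globalbioticinteractions.org/interaction"

FIELDS = [
    "study_title",
    "study_citation",
    "study_source_citation",
    "study_url",
    "latitude",
    "longitude",
    "source_taxon_name",
    "target_taxon_name",
]


def build_interaction_url(source_taxon, target_taxon, interaction_type="eats"):
    """Build the query URL for the interaction endpoint."""
    params = [
        ("fields", ",".join(FIELDS)),
        ("includeObservations", "true"),
        ("interactionType", interaction_type),
        ("sourceTaxon", source_taxon),
        ("targetTaxon", target_taxon),
        ("type", "json.v2"),
    ]
    # safe="," keeps the commas in the field list unescaped, as in the request.
    return BASE_URL + "?" + urlencode(params, safe=",")
```

Calling `build_interaction_url("Neophylax", "Bacillariophyceae")` reproduces the URL shown in the request.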

danielabar commented 9 years ago

I think the issue might be that although the search request was for sourceTaxon=Neophylax, a second result came back for Neophylax concinnus.

And for some reason, the same study gets assigned two different study IDs.

I'm not sure what to do about this. I can push my changes that display the nicer study citation text, but there will be dupes unless I filter for unique values on the actual study citation.
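Filtering on the citation text rather than on study_title could be sketched as below. This is only one possible approach, written in Python for illustration: `unique_references` is a hypothetical helper, and it assumes each row is a dict with the study_citation and study_source_citation fields from the request.

```python
def unique_references(rows):
    """Return one reference per distinct (citation, provider) pair.

    Rows whose study_title differs but whose citation text is identical
    collapse into a single entry. First-seen order is preserved.
    """
    seen = set()
    unique = []
    for row in rows:
        key = (row.get("study_citation"), row.get("study_source_citation"))
        if key in seen:
            continue
        seen.add(key)
        unique.append(
            {
                "study_citation": row.get("study_citation"),
                "study_source_citation": row.get("study_source_citation"),
            }
        )
    return unique
```

Note that this hides the fact that the source reports separate study ids, so it trades accuracy about the underlying data for a tidier reference list.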

jhpoelen commented 9 years ago

@danielabar nicely spotted!

The data source reports two (separate) studies that appear to be the same.

From the source (i.e. SPIRE), here's the first:

...
<Study rdf:ID="s_35">
<titleAndAuthors rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
<![CDATA[G. W. Minshall, Role of allochthonous detritus in the trophic structure
 of a woodland springbrook community, Ecology 48(1):139-149, from p. 148 (1967).
]]>
</titleAndAuthors>
...

and the second:

...
<Study rdf:ID="s_209">
<titleAndAuthors rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
<![CDATA[G. W. Minshall, 1967.  Role of allochthonous detritus in the trophic structure of a woodland springbrook community.  Ecology 48:139-149, from pp. 145, 148.]]>
</titleAndAuthors>
...

The citation is derived by matching the provided citation against crossref.org. So it appears that you found a data error in the source.

I'd suggest treating them as different studies, so you'll see duplication in the citations until the data error is fixed in the SPIRE importer.

thx, -jorrit