humanitiesplusdesign / data-pen

Personal modeling application for Linked Data.
http://hdlab.stanford.edu/fibra

"Verify" button shouldn't show up for verified items #141

Closed alexsherman closed 7 years ago

esjewett commented 7 years ago

You can verify an item against multiple sources, so I think it probably should? Might be a better way to show this though.

cncoleman commented 7 years ago

Ethan, we should talk about verify. I thought the whole point of verifying is that it is only done once. When you have a URI for an entity, you are done. Otherwise, why verify at all?

cncoleman commented 7 years ago

The Configuration stage is supposed to map all of the sources so that each entity essentially only has one ID/URI. That way I'll know if my chosen configuration includes the entity I want to verify or not. Otherwise we have to think in terms of "sort of verified" or partially verified.

esjewett commented 7 years ago

It gets back to the question of how well reconciled we think the sources are going to be with each other. Definitely worth discussion. Technically it's easy enough to do it either way.

cncoleman commented 7 years ago

Alex and I were just discussing this. If we have a configuration, that configuration needs to be trusted. Otherwise no one will use it. It has to be strongly mapped internally. It's worth noting that an entity only appears in one of the (10?) configured/mapped sources. But it's still in there.
If, instead, the person does not appear in our existing/chosen configuration, does Fibra allow a person to search outside of the configuration, for example in Wikidata (if WD is not already in the configuration)? That seems like an outlier case. But if it's the actual case, then once I verify my entity against Wikidata, I'm done. If I don't trust the match, I won't verify. If I do, I'll verify. What is the case when I would want to go back and verify against another source? Wikidata is already linked to VIAF, LOC, etc.

esjewett commented 7 years ago

If I want to use the tool to reconcile 2 sources that aren't reconciled for some reason and pull properties from both of them (maybe some wikidata entry isn't reconciled against VIAF for some reason), I think verifying against both of them would be the way to do it. I suspect it may be a pretty common use case when the data in these sources gets more ragged as you get into less well-known areas.

cncoleman commented 7 years ago

But I thought we were not creating a reconciliation tool. If I search for my person in VIAF and I do find a match, I'm done. If I don't find a match, I can go to Wikidata. If I find it there, I'm done. How could I reconcile those two if the person only exists in one? I see the outlier case, but mostly I just want to make sure my person has a point of reference out in some authority and go from there. It's the job of the authorities to map to each other or of the Libraries to create configurations that map them. Maybe?

esjewett commented 7 years ago

Sure, I think that's fair, and I agree we're not creating a reconciliation tool. I think we'll probably have to come back to this, but am fine with going with a "one and done" approach to reconciliation and seeing how it goes. Put it in the backlog for me? It's a quick change.

cncoleman commented 7 years ago

You bet. Thanks.

jiemakel commented 7 years ago

In an ideal world, we'd have configurations that are already completely mapped. In reality, we're very far from that, and even where mappings exist between authorities, they may not be complete.

Thus, if we don't support reconciliation between authorities inside the tool, we're designing for a world that doesn't yet exist, and thus needs to be created (i.e. all entities between resources laboriously mapped) for each configuration to be provided.

I also think it's a bit utopian to presume that even in the future, clean configurations will be available for all use cases and communities (although of course we want to pitch for that world, and I agree completely that it makes sense for libraries to shoulder the burden of bringing that about).

That doesn't mean that we can't initially target the clean scenario, particularly for purposes of communicating the idea, but I'm pretty sure we'll be requested to add this at a later point.

Also, regarding not creating a reconciliation tool, that's definitely true (I already have Recon, https://github.com/jiemakel/recon/, for that). However, bulk reconciliation is a different act from local reconciliation. Thus far, we've been designing Fibra for a world where we mostly assume good, completely mapped configurations, but still give the user the option to reconcile in places where these configurations don't happen to live up to that promise.

cncoleman commented 7 years ago

Let's clearly identify, then, what we will and will not do.

All the current practice that I am aware of has only been concerned with verifying against one source. Dan verifies people against VIAF and places against Geonames. I have not yet seen anyone verifying against multiple authorities. That just seems like redundant work.

I can imagine a scenario where we have VIAF IDs and we want to map those to EMLO. So I would need a tool to help with that process. Though I think that would in most cases be a batch process. The example came from the workshop: I upload a spreadsheet that includes a column for VIAF. Let me use that ID through Fibra to access additional VIAF properties AND to access properties from other authorities/archives linked to VIAF.

The mode is building a database through rich access to information. What are the cases when I would go through my entities one by one and "super" verify them against multiple sources?





jiemakel commented 7 years ago

The question at the core here is, I think: what is verification? What end does it serve?

I've been seeing verification (or strongly identifying) as a means to an end - verification gets you additional information on the item from other archives that understand that strong identifier (at the end of the day, it also anchors the work you yourself are going to do to the data on some common signpost for others to make use of, but that's not the immediate gain).

Now, in our current environment, it may still very well be that we'd want to pull in information on a person from all of FBTEE, SDFB, EMLO and Electronic Enlightenment (EE), for example, but these don't share a single set of identifiers. Maybe EMLO and SDFB both refer to Wikipedia IDs, while EMLO and EE link to VIAF, but FBTEE doesn't refer to any external authority.

Here, our current thinking has been that if one wants to include information from all the sources, one has to reconcile against the disparate identity sets oneself. In the example above, for any identities in all of EMLO, SDFB and EE, we'd be able to reason identities across the VIAF and Wikipedia ids to arrive at a single entry, but for an entry common to only EE and SDFB we wouldn't be able to (as one refers to VIAF and the other to Wikipedia), and FBTEE would always remain on its own.

(Well, bad example, because in reality VIAF already reconciles against Wikipedia - so again, yeah, we'd be NEARLY at the world where everything meshes nicely already on the level of configuration, but not quite completely)
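To make the EMLO/SDFB/EE/FBTEE example concrete, here is a minimal sketch of how identities could be reasoned across shared identifiers. It is not Fibra's actual implementation: the record IDs, the `viaf:`/`wikipedia:` values, and the function name `merge_by_shared_ids` are all made up for illustration. Records that link to the same external identifier are merged with a simple union-find, and an optional list of manual links stands in for the user reconciling locally where the configuration falls short.

```python
from collections import defaultdict

# Hypothetical records: (source, local id, external identifiers it links to).
# Person A is in EMLO, SDFB and EE; person B is only in EE and SDFB.
RECORDS = [
    ("EMLO", "emlo:1", {"viaf:A", "wikipedia:A"}),
    ("SDFB", "sdfb:1", {"wikipedia:A"}),
    ("EE", "ee:1", {"viaf:A"}),
    ("EE", "ee:2", {"viaf:B"}),
    ("SDFB", "sdfb:2", {"wikipedia:B"}),
    ("FBTEE", "fbtee:1", set()),  # no external authority at all
]


def merge_by_shared_ids(records, manual_links=()):
    """Group local ids that are connected through shared identifiers."""
    parent = {}

    def find(x):
        # Find the representative of x, registering x on first sight.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for _source, local_id, external_ids in records:
        find(local_id)
        for ext in external_ids:
            union(local_id, ext)

    # Links the user asserted by hand (local reconciliation).
    for a, b in manual_links:
        union(a, b)

    clusters = defaultdict(set)
    for _source, local_id, _external_ids in records:
        clusters[find(local_id)].add(local_id)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    print(merge_by_shared_ids(RECORDS))
    print(merge_by_shared_ids(RECORDS, manual_links=[("ee:2", "sdfb:2")]))
```

Without manual links, the three records for person A collapse into one entry because EMLO bridges VIAF and Wikipedia, while `ee:2` and `sdfb:2` stay separate and `fbtee:1` remains on its own. Passing the single manual link joins the EE and SDFB records for person B, which is the "local reconciliation" case discussed above.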