dasch-swiss / dsp-api

DaSCH Service Platform API
http://dasch-swiss.github.io/dsp-api/
Apache License 2.0

(Search panel : extended search) Redesign and extend functionalities #227

Open mrivoal opened 8 years ago

mrivoal commented 8 years ago

I think « Extended search » requires a lot more brainstorming of the whole team.

Functionalities should be extended, in order to allow users to cross-search, for example : Given 2 objects, « Book » and « Author », users should be able to search for all the Books whose « title » (property:title) includes « king » and written by Authors born between « 1700 » and « 1850 » (property : birthDate) in « England » (property:countryOfBirth).

An operator OR should be added as well (Books whose « title » (property:title) includes « king » OR « queen »).

Given that RDF modelling allows (and even encourages) decomposing objects into separate entities, allowing users to cross-search through several objects is a key feature.

I think we should try to figure out the most convenient way to deal with this in Salsah from a UX point of view, and check with backend developers such as Ben and Tobias what the options are in Knora for implementing such functionality on the backend side.

benjamingeer commented 8 years ago

Two options considered in the last UX meeting:

mrivoal commented 8 years ago

I think we should probably discuss these options all together again, because we can't reasonably expect every project admin to write SPARQL query templates for their team (if SPARQL queries are what you have in mind).

And I don't think either that a query as "basic" as the one I suggested should require more technically proficient users to write it. A graphical interface could probably be investigated here. We will try to make some suggestions and I suggest we discuss this together.

benjamingeer commented 8 years ago

I was thinking of templates written in the search language suggested above.

I think the "basic" query you suggested above would actually be quite difficult to represent clearly in a graphical interface, but I would love to be proven wrong.

loicjaouen commented 8 years ago

Isn't it possible to use the editor with wildcards? If you consider an editor like the one described in #231, a simple search would use the same form but leave some fields empty (screenshot: search). This would look for a Resource matching or linked to a person named Barbeyrac in Lausanne.

This is a naïve search that we can decide to leave simple (therefore limited) on purpose.

Or it can be elaborated to specify the resources to look for (Person, Book or whatever possible Resource), the kind of match (match, equal, match bool, not equal, like, not like, exists) and a logical operator for similar search criteria (screenshot: search_more).

This is less naïve but still doesn't embrace all the possibilities of searching (logical operator grouping, versions, user activity, and much more that I can't think of).

jroland01 commented 8 years ago

@loicjaouen Do you mean using the same editor, but in a different mode (e.g. search instead of edit) ? Or building search forms, templates similar to that of the editor ? I guess the second query, written in a "language", would be something like : "type:Person personName:Barbeyrac (eventsPlace:Lausanne OR eventsPlace:Basel)" - is that right ? A question is then : What is the value added for the user in building a query in an "editor-like" fashion vs. in plain text ? You still have to build the logic. But of course you don't have to remember the fields' names (but is that a problem ?). Queries written in a "language" (probably more flexible, and faster to build) could be represented as forms (as you suggest) and used as templates, where "less advanced" users could fill in the blanks or replace values. Another option to take into account, which "less advanced" users could use, is faceted search : e.g. you could do a simple search for "Barbeyrac", and then (in facet "Type") click on "Person", and (in facet "EventsPlace") click on "Lausanne" and on "Basel". Those are just partial examples of course, but my point is that we should consider different ways of searching (simple, forms/templates, expression-based language, faceted search) depending on the objective, and of course the user's skills. Each "way of searching" should be seen in the context of the others if we want to design them properly.

mrivoal commented 8 years ago

Actually, in written language, I would imagine something like this: "class:Person" ["name" (exactMatch):Barbeyrac] has link to "class:Events" ["place" (exactMatch):Lausanne OR "place" (exactMatch):Basel].

You need to know the current data model in order to write the right query. Plus, you need to know which operator can be used with which kind of property format (the operators available for querying text fields are not the same as those available for querying a date field or an integer).

The whole point of using a form is that the form somehow guides the user through the query: here, the user doesn't really need to know that a link exists between Person and Events in the data model, they just go through the form. Plus, the relevant operators would be displayed next to the property field, according to its format, so that users don't have to bother with this. This feature already exists in Salsah 1.0.

benjamingeer commented 8 years ago

@loicjaouen Logical operator grouping seems like a difficult thing to represent visually.

@mrivoal If we provide an expression-based search language, it could have an intelligent editor that would prompt the user with relevant options while the user is typing, the way an IDE does, and forbid illogical ones. Thus it would guide the user through the query.

loicjaouen commented 8 years ago

@benjamingeer That's why we might just decide to keep it simple and dedicate it to simple searches. If we do come to represent it, though, I would suggest sieving: trickling results down through layers of filters. https://app.box.com/s/4vw4q1ka69kkhv0o9jo96arqg3k30eh2

jroland01 commented 8 years ago

A few comments in addition to @benjamingeer's, which I agree with. I am not in Knora/Salsah's more "technical" aspects, so forgive me for any inaccuracies there. I would suggest using as much as possible plain language and human readable forms in queries. E.g. a "class" probably doesn't mean much for an average user, while a "type" is something we frequently use, although it's probably not completely accurate from a "technical" standpoint. Most people would probably say Persons and Books are "types" of things rather than "classes" of things (there is probably a better word than "type"). I would also suggest using standards and conventions in queries. E.g. most search engines (incl. Google Search) use double-quotes (instead of (exactMatch)). Users (at least the advanced ones) are used to that, and this will help them learn more quickly.

Another question : Is it possible to implement (maybe approximate) @loicjaouen's example with a faceted search for less advanced users ? E.g. using things similar to LinkedIn search's conditional facets.

benjamingeer commented 8 years ago

A list of filters like this would be easy to translate into SPARQL:

I'm looking for B
B is of type Book
B hasTitle T
T includes "king"
B hasAuthor A
A hasBirthDate D
D is greater than or equal to 1700
D is less than or equal to 1850
A hasBirthPlace "England"

This is basically the structure of a SPARQL WHERE clause: a list of statements (subject-verb-object) about what you're looking for, using variable names for the subjects and objects. AND is implicit: all the statements must be true to get a match. You only need grouping to express OR and NOT. Thinking in terms of filters, maybe OR could be called ANY OF, and NOT could be called WITHOUT. So you could write something like this:

I'm looking for B
B is of type Book
B hasTitle T

ANY OF:
  * T includes "king"
  * T includes "queen"
  * T includes "prince"

WITHOUT:
  T includes "emperor"

B hasAuthor A
A hasBirthDate D
D is greater than or equal to 1700
D is less than or equal to 1850
A hasBirthPlace "England"
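
As a rough illustration of the translation described above, here is a minimal Python sketch that turns a filter list like this one into a SPARQL query string. Everything named here is an assumption made for the example: the function `filters_to_sparql`, the tuple-based input format, the `http://example.org/ontology#` prefix and the predicate names (`:hasTitle`, `:hasAuthor`, and so on) are hypothetical and are not taken from Knora or SALSAH. Birth dates are treated as plain years to keep the sketch short.

```python
def contains(var, text):
    """SPARQL test: the value bound to ?var includes text (case-insensitive)."""
    # Note: text is not escaped here; a real implementation would have to do that.
    return f'CONTAINS(LCASE(STR(?{var})), "{text.lower()}")'


def filters_to_sparql(target, triples, comparisons=(), any_of=(), without=()):
    """Build a SPARQL SELECT query from a flat list of filters.

    triples     -- (subject, predicate, object) statements; AND is implicit
    comparisons -- (variable, operator, number) tests, e.g. ("D", ">=", 1700)
    any_of      -- (variable, text) pairs, at least one must match (OR)
    without     -- (variable, text) pairs, none may match (NOT)
    """
    lines = [
        "PREFIX : <http://example.org/ontology#>",  # hypothetical namespace
        f"SELECT ?{target} WHERE {{",
    ]
    for s, p, o in triples:
        lines.append(f"  ?{s} {p} {o} .")
    for var, op, number in comparisons:
        lines.append(f"  FILTER(?{var} {op} {number})")
    if any_of:
        alternatives = " || ".join(contains(v, t) for v, t in any_of)
        lines.append(f"  FILTER({alternatives})")
    for var, text in without:
        lines.append(f"  FILTER(!{contains(var, text)})")
    lines.append("}")
    return "\n".join(lines)


query = filters_to_sparql(
    target="B",
    triples=[
        ("B", "a", ":Book"),
        ("B", ":hasTitle", "?T"),
        ("B", ":hasAuthor", "?A"),
        ("A", ":hasBirthDate", "?D"),
        ("A", ":hasBirthPlace", '"England"'),
    ],
    comparisons=[("D", ">=", 1700), ("D", "<=", 1850)],
    any_of=[("T", "king"), ("T", "queen"), ("T", "prince")],
    without=[("T", "emperor")],
)
print(query)
```

The plain statements become triple patterns, ANY OF becomes a single `FILTER` joined with `||`, and WITHOUT becomes a negated `FILTER`, which mirrors the observation that only OR and NOT need grouping.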

mrivoal commented 8 years ago

Your suggestion is great, Ben. But I am pretty sure that the "average" Knora/Salsah user will be quite reluctant to use a non-graphical query system. Some users are willing to dig further, and for those, your suggestion will probably work. But what of those used to Filemaker, for example? What of those accustomed to such a query interface: https://lumieres.unil.ch/chercher/bibliographie/ ?

I think that relying only on a query language for a basic search such as the one we suggested is simply going to frighten some users and prevent them from considering Salsah as an option for their needs.

jroland01 commented 8 years ago

@benjamingeer Looks great. The challenge I see here is that this requires (to some extent) a "developer's view or model", as the syntax requires thinking in terms of relationships between objects, and thus having a certain view on the data model (as suggested earlier by @mrivoal). From a UX perspective, I think it would be very useful to leverage a "user's view or model". And a user's model stems from what she sees and experiences, which is a bunch of entry forms, with labels and fields - so it's probably a rather "flat" representation of each type of thing. So I believe it would be a good idea if we can somehow leverage that, in particular labels, in the syntax.

This may look like a naive question, but if I put your query like that (below), deriving keywords/operators from form labels - why wouldn't this work ? - what would be the limitations/challenges ?

Type:Book (Title:"King" OR Title:"queen" OR Title:"prince") -(Title:"emperor") AuthorBirthDate:>1700 AuthorBirthDate:<1850 AuthorBirthPlace:"England"
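
To make the question concrete, here is a minimal Python sketch of how such an operator-style query could be split into criteria. The function names `parse_criteria` and `parse_query`, the operator labels ("is", "matches") and the output structure are assumptions made for this illustration only. It handles just the constructs used in the query above (one level of parentheses, `OR` inside a group, a leading `-` for exclusion, `>` and `<` on numbers).

```python
import re

# One criterion: Field:"quoted value", Field:>number, Field:<number or Field:word
CRITERION = re.compile(r'(\w+):(?:"([^"]*)"|([<>])(\d+)|(\w+))')


def parse_criteria(text):
    """Return a list of (field, operator, value) tuples found in text."""
    result = []
    for field, quoted, op, number, word in CRITERION.findall(text):
        if op:
            result.append((field, op, int(number)))
        elif word:
            result.append((field, "is", word))
        else:
            result.append((field, "matches", quoted))
    return result


def parse_query(query):
    """Split a query into criteria that must all, any-of, or never match."""
    parsed = {"all": [], "any_of": [], "without": []}
    # Parenthesised groups: "-( ... )" excludes, "( ... OR ... )" is an OR group.
    for minus, group in re.findall(r"(-?)\(([^)]*)\)", query):
        criteria = parse_criteria(group)
        if minus:
            parsed["without"].extend(criteria)
        else:
            parsed["any_of"].append(criteria)
    # Whatever is left outside the parentheses is implicitly ANDed.
    rest = re.sub(r"-?\([^)]*\)", " ", query)
    parsed["all"] = parse_criteria(rest)
    return parsed


example = (
    'Type:Book (Title:"King" OR Title:"queen" OR Title:"prince") '
    '-(Title:"emperor") AuthorBirthDate:>1700 AuthorBirthDate:<1850 '
    'AuthorBirthPlace:"England"'
)
parsed = parse_query(example)
print(parsed)
```

Parsing the text is the easy part. One likely challenge is that a flat label such as `AuthorBirthDate` still has to be mapped to a path through the data model (a Book has an author, the author has a birth date), and that mapping would have to come from somewhere, for example from per-project configuration.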

@mrivoal a few questions :

benjamingeer commented 8 years ago

The language I suggested above could be implemented mostly with pull-down menus, as in the Lumières interface. This would be similar to what we have now in the SALSAH extended search. Currently, to add a search criterion, you click a button and you get a row of pull-down menus for predicate and object, but the subject is always the same resource. The main challenge here, as I see it, is to allow users to talk about relations between multiple resources. If you think they won't like variable names, we could use colours or graphical symbols that they could select from pull-down menus. This would preserve the simple list-like structure of the interface (which the Lumières interface also uses). Representing relations between multiple resources graphically, in a generic way, with support for OR and NOT, seems very hard to me. I can't imagine what a purely graphical representation of my second example above (the one with ANY OF and WITHOUT) could look like. Can you?

loicjaouen commented 8 years ago

@jroland01 :

benjamingeer commented 8 years ago

@jroland01 As @loicjaouen says, we have to support all sorts of relationships between all sorts of resources. Some examples:

As I see it, the ability to do queries like this makes it possible to do research that wouldn't be possible otherwise.

In the current SALSAH, the verbs in the examples above are called properties. The current extended search populates pull-down menus so you can select properties. The names of the properties are user-friendly names configured in the project's ontology (birth date rather than hasBirthDate), in the user's preferred language (currently English, French, German, or Italian).

If you tell SALSAH what kind of resource you're looking for, it only lets you select properties that are possible for that type of resource (because this is defined in the project's ontology). It also knows what kind of value each property can have. For example, if you select a property called birth date, SALSAH knows that the value must be a date (because this is defined in the project's ontology). So it shows you a calendar widget to select a date.
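
The behaviour described in the last two paragraphs can be sketched in a few lines of Python. The `ONTOLOGY` dictionary below is a hand-written, hypothetical stand-in for a project's ontology, and the names `property_menu` and `WIDGETS` are invented for this example; none of this reflects the actual SALSAH or Knora implementation.

```python
# Hypothetical stand-in for a project's ontology: for each resource type,
# the properties it can have, their labels per language, and their value type.
ONTOLOGY = {
    "Person": {
        "hasBirthDate": {
            "labels": {"en": "birth date", "fr": "date de naissance"},
            "value": "date",
        },
        "hasName": {"labels": {"en": "name", "fr": "nom"}, "value": "text"},
        "isAuthorOf": {
            "labels": {"en": "is the author of", "fr": "est l'auteur de"},
            "value": "Book",  # points to another resource type
        },
    },
    "Book": {
        "hasTitle": {"labels": {"en": "title", "fr": "titre"}, "value": "text"},
    },
}

# Input widget to show for each literal value type.
WIDGETS = {"date": "calendar", "text": "text field"}


def property_menu(resource_type, language="en"):
    """Return (property, label, widget) rows for one resource type."""
    rows = []
    for name, spec in ONTOLOGY[resource_type].items():
        label = spec["labels"].get(language, name)
        if spec["value"] in ONTOLOGY:
            # The value is itself a resource: prompt for it with a search box.
            widget = "resource search"
        else:
            widget = WIDGETS[spec["value"]]
        rows.append((name, label, widget))
    return rows


for row in property_menu("Person", language="fr"):
    print(row)
```

Because the menu is derived from the ontology rather than written by hand, the same code would serve any project, which is the point of generating the interface automatically.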

If you select a property that must point to another resource (in the examples above, faces, is the author of, was published by, is located in), SALSAH prompts you for the resource you want, using a predictive-text search. But this is not adequate, because we want to be able to talk about properties of that resource, too. So the challenge is to find a way to let the user match patterns like this:

+-> Resource A
|     property A1: value
|     property A2: value
|     property A3 ------------------> Resource B
|                                       property B1: value
|                                       property B2: value
|                                       property B3 ------------------> Resource C
|                                                                         property C1: value
|                                                                         property C2: value
|                                                                         property C3 -+
|                                                                                      |
+--------------------------------------------------------------------------------------+
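
To show what matching such a pattern involves, here is a minimal Python sketch of a backtracking matcher over (subject, property, value) triples. The data, the property names (`A1`, `A3`, `B1`, `B3`, `C3`, borrowed from the diagram) and the function `match` are all invented for this illustration; a real implementation would hand a query to the triplestore instead.

```python
def is_var(term):
    """Variables are written with a leading question mark, e.g. "?a"."""
    return term.startswith("?")


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        new = dict(binding)
        for want, got in zip(first, triple):
            want = new.get(want, want)  # substitute already-bound variables
            if is_var(want):
                new[want] = got  # bind a fresh variable
            elif want != got:
                break  # this triple does not fit the pattern
        else:
            yield from match(rest, triples, new)


# Invented sample data: r1 -> r2 -> r3 -> r1 forms a loop, r4 does not.
triples = [
    ("r1", "A1", "red"),
    ("r1", "A3", "r2"),
    ("r2", "B1", "blue"),
    ("r2", "B3", "r3"),
    ("r3", "C3", "r1"),  # closes the loop back to r1
    ("r4", "A1", "red"),
    ("r4", "A3", "r2"),  # r4 starts like r1, but nothing links back to it
]

patterns = [
    ("?a", "A1", "red"),
    ("?a", "A3", "?b"),
    ("?b", "B1", "blue"),
    ("?b", "B3", "?c"),
    ("?c", "C3", "?a"),  # the third resource must point back to the first
]

results = list(match(patterns, triples))
print(results)
```

The variables `?a`, `?b` and `?c` play the role of Resource A, B and C in the diagram; reusing `?a` in the last pattern is what expresses the link back to the first resource.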

jroland01 commented 8 years ago

@mrivoal A few notes regarding Lumières.Lausanne advanced search, based on my interview notes with Béatrice Lovis (you probably have even more data on that) :

My hypothesis is : Most people in the project probably don't use the full extent (far from that ?) of the search possibilities, but I may be wrong (that's why factual data would be great, such as e.g. statistics or a survey).

Another point : The "advanced search" form could be implemented using a mix of much simplified input form (e.g. words in fields) or a free text field with keywords (e.g. Titre:, Auteur:, etc.) + a faceted search (e.g. type de document, revue, etc.) + sorting. The experience delivered would probably be much better : more streamlined/better structured page, perception of simplicity which would engage users (who may otherwise be frightened by the form), intelligent rankings of values in facets based on input, ability to drill-down into the data and experiment with auto results update, etc.

Thus the following questions :

jroland01 commented 8 years ago

@loicjaouen @benjamingeer Thank you for your clarifications. Question : Do we need to make the full power/possibilities of Salsah/Knora available through the expression-based language ? Or can we afford to only have a subset of those possibilities ? Allowing for more possibilities (e.g. more relationships, more elaborate syntax) will make it less approachable for users who might otherwise use it (even if you add "constraints" such as predictive text, etc. to guide and error-proof).

Do we have an idea regarding how many existing/potential Salsah users would require more advanced queries/relationships (such as those described in @benjamingeer's latest comment) ?

Just thinking out loud, but what about having both :

benjamingeer commented 8 years ago

Knora is to some extent a leap of faith: we're offering users the ability to do things that many of them have never imagined doing. We don't have the money to do market research to find out how many people would actually want to do those things. We know some who will be excited right away, and will use these tools to do new kinds of humanities research. Maybe others will see this new research and then be inspired to try these tools. It's difficult to know.

We will definitely have some users who already have experience with SQL or a programming language. We will also definitely have users with minimal technical knowledge. We don't know how many we'll have of each, but I think we have to accommodate both groups somehow. We also have to give novice users a learning path for trying more and more ambitious things.

I'm all for doing graphically what can be done graphically. But we can't make a custom search interface for each project. The search interface has to work for all projects, and that means it has to be generated automatically from the information in each project's ontology. The ontology says what types of resources there are, what properties they can have, and what relations can exist between them. Knora's data model is a graph of interconnected resources, and I don't think we can or should try to hide that from the user, because finding interesting relationships between things is a big part of what Knora is useful for. It's not just about searching through a flat list of stuff. The stained glass example is a real example from a project called VitroCentre that stores information about stained-glass windows in Switzerland, including the physical relationships between windows in a church. For the artist who made the windows, it was meaningful that Peter was facing Paul across the church. If a relation is meaningful, we should be able to search for it.

I'm not a UX designer so I can't say what this sort of interface should look like. All I can say is that whatever the interface looks like, it will internally have to be implemented by generating something like the examples I suggested above. Those are the sorts of capabilities we need to offer, at least to more advanced users.

jroland01 commented 8 years ago

@benjamingeer Thank you for this. I understand your point.

I have attempted (see image below) to do a first "model" of Salsah's search strategy (based on our discussions in this thread), so that we can build a common understanding regarding the challenge. It's based on the assumption that Salsah will have different kinds of users (here much simplified as "normal" and "experts"), with different search needs depending on what they do (here much simplified as "simple", "advanced", "very advanced") :

(image: untitled)

From our discussions, I gather that :

Do you agree (to some extent !) with this ?

The problem I see is thus : How to find a suitable solution for [3] ?

My suggestion : Address [3] through a combination of faceted search ("à la LinkedIn Search"), simple search forms (if required) and simple operators that a user can input in the search field ("à la Google Search Operators"). But there are probably other solutions.

I also understand there are other criteria to take into account, such as the fact that this needs to be somehow generic.

Based on my interviews, and experience elsewhere, I strongly believe that delivering a good search experience (much improved from previous versions, up to the standards elsewhere on the web) to the user will be key in making Salsah sustainable.

@mrivoal @benjamingeer @loicjaouen Any thoughts ?

mrivoal commented 8 years ago

I am not quite convinced by this assertion yet:

I don't think we have been through all possibilities yet and I think that, @jroland01, this is precisely where you could prove yourself useful to us regarding the design of the form, any drag and drop functionalities, and so on. Hopefully, we could think of something that wouldn't be that confusing to users.

I would say that the best option would be a combination of both: a graphical edit form with some search operators, carefully selected in order to be meaningful to users, available precisely when users intend to search across 2 different objects, as explained in the very first post.

I think that, for now, we have all expressed our beliefs (and feelings/facts regarding the way different kinds of users are going to deal with search functionalities). I suggest that we keep everything in mind (and continue thinking about it) and take the opportunity to discuss this all together on 5-6 September.

jroland01 commented 8 years ago

@mrivoal Just a few points regarding your post :

(a) My main point was whether we agree or not on where the challenge is with Search. I believe that clearly identifying the problem (which doesn't seem to be the case until now) is essential to finding a good solution.

(b) We can certainly think further about the graphical edit-based form, but I am not entirely convinced that by adding surface features, such as e.g. drag and drop, I would better "prove myself useful to you" (in fact making things usable requires much more than that). I believe I am also useful in challenging this solution. We may eventually see that suggested alternatives don't address the needs in [3], but at least we'll know that there isn't any better alternative.

(c) My thinking is the following : We seem to agree that a faceted navigation is required (as per your posts in #216 and #228), so why don't we try to "push it" to see if we can address the needs in [3] that way ? In fact, unless I misunderstand things, I believe we can solve the early (simple) examples in this thread, as well as Lumières.Lausanne's form, with a faceted navigation and some simple form/operator solution.

OK to further discuss on Sept. 5th (I am not aware of a discussion on the 6th), but I would suggest we take some time @ UNIL next week to prepare (as per your suggestion) - what do you think ?

jroland01 commented 8 years ago

@milchkannen @loicjaouen @mrivoal Here is a concept for Salsah's search, implementing Lumières.Lausanne's advanced form, but using a combination of techniques incl. faceted search, form-based search, simple logical / search operators and faceted filtering :

(screenshot: faceted search l l)

(screenshot: faceted search l l form)

(screenshot: faceted search l l operators)

(screenshot: faceted search l l operators advanced)

Some advantages :

(to be clear : this of course has limits, and won't be able to replace a full expression-based language, as already suggested, but this is not the purpose, see discussion in #219 )

mrivoal commented 8 years ago

Thanks, @jroland01 for the suggestion.

What worries me regarding the faceted search is that the facets you suggested here for Lumières.Lausanne cannot be implemented in a generic way: each project will have to pick the most convenient facets amongst its resources and properties. Typically, some facets displayed here on the same level actually involve different kinds of objects: Type de document, Type de littérature, Date de parution are all properties of a Bibliography resource, while the list under Projets points to the property Name of another resource, Project.

If we want to implement this in a generic way, the display should probably be different, maybe more hierarchical and more oriented towards the nature of the various elements (resources or properties). What do you think?

jroland01 commented 8 years ago

@mrivoal I understand your point. Two comments, one related to the UX, and one related to the process :

jroland01 commented 8 years ago

@mrivoal @loicjaouen @milchkannen Just figured out GitHub uses a similar search concept, but lacking the dropdown form, and the logic of the facets seems a bit weird to me :

(screenshot: search_ _is_open_author_jroland01_created__2016-08-01)

A well written, well organized cheat sheet is also a good thing to have somewhere handy in the software.

benjamingeer commented 8 years ago

Please keep in mind that by default, Salsah searches in all data in all projects. It is possible to limit search results to one project, and it would make sense for some projects to offer a project-specific search interface. But one of the main goals of Knora is to facilitate data reuse, and that means combining data from different projects. So there has to be a generic search interface in which the user can describe relations between resource types defined in different projects.

jroland01 commented 8 years ago

@benjamingeer Yes. As per my interviews (see Kunsthalle Basel, Anton Webern notes) :

I have investigated that question during the interviews, and while showing the full list of projects in Salsah, users have mentioned that : (1) they don't have enough information available to know what the other projects are doing, (2) it is likely that most projects in the list will not be directly relevant to them (e.g. Kunsthalle Basel and Anton Webern).

I would thus suggest :

musicEnfanthen commented 8 years ago

Just saw an example for an RDF-based faceted search interface on Twitter. Check out here: https://twitter.com/fkraeutli/status/773431776568049664

Maybe inspiring?

jroland01 commented 8 years ago

@musicEnfanthen Very interesting example indeed, thanks !

(If you haven't already) you can watch a demo video here : https://www.youtube.com/watch?v=f5lU-D_3s7M

I believe it is a great example of :

mrivoal commented 8 years ago

Super interesting and I think this could be inspiring, you are right, @musicEnfanthen.

I think the main difference here is that they only have to deal with one ontology (CIDOC-CRM) and the drawback is that everything must fit into this ontology (and building a consensus around an ontology in cultural heritage is probably easier than in the whole range of humanities).

When it comes to research in humanities, you need more flexibility. So I think we are again facing the same problem: how is it possible to satisfy most research projects in a generic way, given the fact that each project has its own ontology?

However, sure, there are some great ideas in here!

jroland01 commented 8 years ago

@mrivoal Have you read this paper : "Hildebrand, M., & Ossenbruggen, J. van. (2006). /facet: A browser for heterogeneous semantic web repositories. Semantic Web …. Retrieved from http://link.springer.com/10.1007%2F11926078_20" ?

It's available here as PDF : http://oai.cwi.nl/oai/asset/11421/11421D.pdf

It discusses the challenges in building a faceted search in a context very similar to that of Salsah :

It also discusses advantages, disadvantages of a fully generic approach (theirs) vs. some configuration for each dataset, such as e.g.

The paper is a bit old (2006), so there are probably more recent research results on this.

jroland01 commented 8 years ago

@mrivoal Based on our discussions regarding faceted search and the generic search interface proposals that I have seen, I honestly believe that a fully generic interface (no interface configuration at all) will be a major hindrance in achieving an acceptable usability, or user experience for Salsah.

A fundamental reason for that is that a generic interface doesn't (easily) allow an abstraction level, and exposes the underlying data model to the user (as seen in the generic search interface proposals*), while most usability and user experience work precisely consists in abstracting from the data model (as discussed with the various facet-based search proposals) and in allowing to take into account how specific projects work with their data (through some level of configuration).

What do you think ?

mrivoal commented 8 years ago

I will have a look at the paper you mentioned.

jroland01 commented 8 years ago

@mrivoal Sure, I understand. Just a few clarifications maybe :

To move forward : Should we meet @UNIL to discuss your proposed solution ("linked forms") and to understand what the margin of manoeuvre is, given the project's assumptions (it's really hard for me to make any further proposals without interacting directly to know what is acceptable, and what is not, given the constraints) ?

mrivoal commented 8 years ago

jroland01 commented 8 years ago

A paper discussing the challenges in implementing generic research software with a good usability (more specifically in the context of a virtual research environment for the humanities) : "Harms, P., & State, G. (2011). Usability of Generic Software in e-Research Infrastructures. Journal of the Chicago Colloquium on Digital Humanities and Computer Science, 1(3), 1–18."

It's available here as a PDF : https://letterpress.uchicago.edu/index.php/jdhcs/article/download/89/98

It attempts to explain why it's not possible to achieve good usability with a fully generic software / interface aimed at different application contexts (projects), and suggests a solution.

jroland01 commented 8 years ago

Has anyone had a look at SemFacet (http://www.cs.ox.ac.uk/isg/tools/SemFacet/) ? If not, it's an example of how a faceted search could be "pushed" in the context of semantic technologies, illustrating several interesting concepts :

1 - the ability to build more advanced queries than what is possible with "standard" faceted search. E.g. find all politicians --> who have children --> who graduated from certain universities

2 - the ability to refocus the output of queries. E.g. see the children matching the query in 1 in search results (instead of seeing the politicians)

3 - the navigation map view, representing the query built through the facets. Unfortunately, the online demo seems to be down, but there is a good demo video here : https://www.youtube.com/watch?v=n_uEDsTJ2KU