distributed-text-services / specifications

Specifications for the DTS API
https://w3id.org/dts

Document's Textual Search #69

Open balmas opened 6 years ago

balmas commented 6 years ago

From @jonathanrobie on September 9, 2017 21:21

Thibault's proposed amendments (https://github.com/distributed-text-services/distributed-text-services.github.io/wiki/Distributed-Text-Services-API-Proposal-Thibault-Amends) have a section called Document's Textual Search. The motivation is described thus:

Document's Textual Search

The document entry point is used to search through collection(s) of text(s). We split it from Document to allow for a simpler response format such as below. This will help us avoid problems to parse XML-TEI for textual hits.

We should discuss this in the meeting, but the motivation and design of this were not clear to me.

Copied from original issue: distributed-text-services/distributed-text-services.github.io#5

balmas commented 6 years ago

From @PonteIneptique on September 10, 2017 5:48

  1. Document retrieval based on a passage identifier and document search based on textual content are fundamentally different. In the first case the user knows what to expect in terms of content; in the second the user does not.
  2. With XML-TEI as the base format for DTS, search hits are going to be costly to display in that base format.
  3. Having a separate API route allows for a dedicated result format and makes the role of each route clear.
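
To make point 3 concrete, here is a minimal sketch of what a search-specific result format could look like. It is not part of the DTS specification or of the proposed amendments: the function name `search_passages`, the `passage`/`match`/`snippet` keys, and the in-memory `passages` mapping are all hypothetical and chosen only for illustration. The idea is that hits come back as small plain records that a client can display directly, without parsing XML-TEI.

```python
import re


def search_passages(passages, query, context=20):
    """Return plain-dict hits for `query` across {passage_id: plain_text}.

    Hypothetical helper: the names and the hit structure are assumptions
    made for this sketch, not anything defined by DTS.
    """
    hits = []
    # Escape the query so it is matched literally, ignoring case.
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    for passage_id, text in passages.items():
        for match in pattern.finditer(text):
            # Keep a little surrounding text so the hit can be displayed as-is.
            start = max(match.start() - context, 0)
            end = min(match.end() + context, len(text))
            hits.append({
                "passage": passage_id,
                "match": match.group(0),
                "snippet": text[start:end],
            })
    return hits


# Toy data standing in for already-extracted passage text.
passages = {
    "1.1": "Arma virumque cano, Troiae qui primus ab oris",
    "1.2": "Italiam fato profugus Laviniaque venit",
}
hits = search_passages(passages, "troiae")
```

Each hit carries the passage identifier, so a client that wants the full TEI can follow up with an ordinary document request for that passage.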

balmas commented 6 years ago

From @PonteIneptique on September 13, 2017 9:32

Duplicate of #3 ?

balmas commented 6 years ago

From @jonathanrobie on September 14, 2017 13:4

In the current design - with or without your amendments - document metadata lives under collections, not under documents, which represents the text of the document. I think that gives the clarity you are looking for.

I don't understand what you are saying about the cost of displaying properties. I doubt very much that servers will parse TEI at query time in order to return properties; metadata will generally be stored in some convenient, queryable form before that. One strategy, of course, is to put it in an XML database.

jonathanrobie commented 6 years ago

I suspect this issue is now out of scope, since we are not doing search. Close it?

PonteIneptique commented 6 years ago

Agreed. We might reevaluate after the draft is out.