rug-compling / alpinocorpus

Library for handling Alpino corpora
GNU Lesser General Public License v2.1

Proposal: queryWithStylesheet #14

Closed danieldk closed 12 years ago

danieldk commented 12 years ago

We have long been considering whether to add support for XSL transformations to alpinocorpus. Until now we decided against it, because it was cleaner to let the library user do the transformations. However, this poses a problem for a future RemoteCorpusReader: it should implement the CorpusReader interface (and nothing more), but we cannot realistically expect a client to download every XML file matching a query just to apply transformations (e.g. in the sentence widget of Dact).

We cannot just implement transformations in the server, because then only RemoteCorpusReader would provide this functionality.

Summary: we need library/server-side XSL transformations.

My proposal is to add a method to CorpusReader:

EntryIterator CorpusReader::queryWithStylesheet(QueryDialect d, std::string const &q,
    std::string const &stylesheet, std::list<MarkerQuery> const &markerQueries) const;

If this method is called, it will return a normal EntryIterator, with an overloaded version of the ill-fated contents() method. Calling the contents() method would:

The markerQueries argument can be used to mark nodes using queries (like readMarkQueries). We could have some default behavior, where, if the argument is unspecified, it will use the query specified in the second argument to mark nodes.
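The default behavior described above could look roughly like the sketch below. It is not the library's implementation: the MarkerQuery fields, the helper name effectiveMarkerQueries, and the "active"/"1" attribute pair are all assumptions made for illustration.

```cpp
#include <list>
#include <string>

// Hypothetical stand-in for the library's marker-query type;
// the field names are assumptions, not the actual alpinocorpus definition.
struct MarkerQuery {
    std::string query;  // query selecting the nodes to mark
    std::string attr;   // attribute to set on each selected node
    std::string value;  // value to give that attribute
};

// Hypothetical helper: returns the marker queries to apply. If the caller
// supplied none, fall back to a single marker built from the main query.
std::list<MarkerQuery> effectiveMarkerQueries(
    std::string const &query,
    std::list<MarkerQuery> const &markerQueries = std::list<MarkerQuery>())
{
    if (!markerQueries.empty())
        return markerQueries;

    // Placeholder attribute/value pair, chosen only for this sketch.
    MarkerQuery fallback = {query, "active", "1"};
    return std::list<MarkerQuery>(1, fallback);
}
```

With a helper like this, queryWithStylesheet could resolve the marker list once, before any entry is transformed, so that an explicit list always takes precedence over the query-derived default.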

This change would make it possible to:

Any comments?

danieldk commented 12 years ago

Done in e8166947560f99063273f22900bc546167e5e445. Hopefully good enough for now...