BenoitTalbot / bungeni-portal

Automatically exported from code.google.com/p/bungeni-portal

Bungeni Search Indexer and Site Search #632

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
The site search functionality in bungeni is missing.

TO DO :
 -- Add site search, with a search box in the top right-hand corner of every page of the site
 -- Add an advanced search option that allows searching within specific content types, with date filters

Original issue reported on code.google.com by ashok.ha...@gmail.com on 17 Jun 2010 at 7:36

GoogleCodeExporter commented 8 years ago

Original comment by ashok.ha...@gmail.com on 17 Jun 2010 at 8:08

GoogleCodeExporter commented 8 years ago

Original comment by ashok.ha...@gmail.com on 17 Jun 2010 at 8:34

GoogleCodeExporter commented 8 years ago

Original comment by flavio.z...@gmail.com on 29 Jun 2010 at 9:13

GoogleCodeExporter commented 8 years ago
Specific content-types:

-- Parliamentary Items (Bills, Motions, Questions, Tabled Documents, Agenda 
Items ) 
-- Parliamentary Metadata (Parliaments, Members of Parliament, Government, 
Minister, Committees )

Note that the search must take into account the permissions applicable to an item 
-- i.e. the search must not return items that the user does not have 
permission to 'view'.
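
The permissions note above can be sketched as a post-filter on search hits. This is only an illustration under stated assumptions: the `Hit` class, its `viewers` attribute and the principal names are hypothetical, and nothing here reflects how Bungeni or Xapian actually store permissions.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    title: str
    # Hypothetical: names of the principals/roles allowed to view the item.
    viewers: set = field(default_factory=set)


def visible_hits(hits, principals):
    """Keep only the hits that at least one of the user's principals may view."""
    principals = set(principals)
    return [hit for hit in hits if hit.viewers & principals]


hits = [
    Hit("Finance Bill", {"anonymous", "mp"}),
    Hit("Draft Motion", {"clerk"}),
]
print([hit.title for hit in visible_hits(hits, ["anonymous"])])
```

Filtering after the query is the simplest option; storing permission terms in the index and filtering inside the query would avoid fetching hidden items at all, at the cost of re-indexing when permissions change.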

Original comment by ashok.ha...@gmail.com on 21 Sep 2010 at 6:37

GoogleCodeExporter commented 8 years ago
Attachments should also be searched.  In the advanced search a flag (checkbox) 
can be used.

Original comment by maishaya...@gmail.com on 21 Sep 2010 at 9:12

GoogleCodeExporter commented 8 years ago
An updated summary of the issue is presented below 
--------------------------------------------------

Bungeni uses the Xapian indexing engine for searching parliamentary objects. 
The intended benefits of using Xapian include -- 
-- the ability to iterate over and index parliamentary objects
-- filtering of results based on permissions.

Problem:
The site search functionality in Bungeni is missing.

TO DO :
 -- Add site search, with a search box in the top right-hand corner of every page of the site
 -- Add an advanced search option that allows searching within specific content types, with date filters

Specific content-types to be indexed:

-- Parliamentary Items (Bills, Motions, Questions, Tabled Documents, Agenda 
Items ) 
-- Parliamentary Metadata (Parliaments, Members of Parliament, Government, 
Minister, Committees )

Note that the search must take into account the permissions applicable to an item 
-- i.e. the search must not return items that the user does not have 
permission to 'view'.

Additional requirement :
Attachments should also be indexed and searched.  In the advanced search a flag 
(checkbox) can be used.

Original comment by ashok.ha...@gmail.com on 29 Oct 2010 at 6:35

GoogleCodeExporter commented 8 years ago
Updated Summary of this issue is presented below
------------------------------------------------

Problem Statement:
The site search functionality in bungeni is missing.

TO DO :
Add a site-search functionality with a search box on the top right hand corner 
of every page in the site

The site-search functionality should work as follows --

  1) Searching for a keyword in the search box should search indexed parliamentary content types for the keyword.
  2) It must be possible to configure which kinds of objects can be indexed (Bills, Motions, Questions, Tabled Documents, Agenda Items, Parliaments, Members of Parliament, Government, Minister, Committees, etc.) -- and also which attributes of the object are indexed by the search indexer.
  3) The search must be permissions aware, e.g. if a member of parliament searches for something, the search results must be restricted to the items the member of parliament has permission to view -- similarly, if an anonymous user searches for something, the search results must be restricted to the items visible to an anonymous user.
  4) The search must be translations aware -- i.e. in Bungeni content can be translated into multiple languages -- the user is allowed to switch the active language of the site; a search made in a particular language must preferentially search for content in that language, before defaulting to the base language (English).
  5) Attachment content types must also be indexed; where possible, known attachment binaries (e.g. pdf, word) must also be indexed and searched.
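
One possible reading of point (4) is sketched below: search the active language first and fall back to the base language only when nothing matches. The per-language dictionary and the substring match are stand-ins for a real index, and the "fall back only when empty" rule is an assumption about what "preferentially" should mean.

```python
BASE_LANGUAGE = "en"  # the base language named in point (4)


def search_with_fallback(index, keyword, active_language):
    """Search the active language first, then the base language.

    `index` is a hypothetical mapping of language code -> list of titles.
    Returns (language, title) pairs from the first language with matches.
    """
    keyword = keyword.lower()
    # dict.fromkeys removes the duplicate when the active language is the base.
    for lang in dict.fromkeys([active_language, BASE_LANGUAGE]):
        matches = [t for t in index.get(lang, []) if keyword in t.lower()]
        if matches:
            return [(lang, title) for title in matches]
    return []


index = {
    "en": ["Finance Bill", "Question on water supply"],
    "fr": ["Projet de loi de finances"],
}
print(search_with_fallback(index, "bill", "fr"))
```

An alternative would be to always search both languages and rank active-language hits higher; which behaviour is wanted is left open by the requirement.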

User Interface :

1) The default search is a search box on the top right hand corner on every 
page of the website. The default search will search all indexed content types 
and attributes for the keyword. 

2) The advanced search page will allow filtering within specific kinds of 
objects (which have been configured as 'indexable'), and also allow filtering 
by the indexable attributes available on the object. Additional filter options 
: date filter to filter within specific date ranges (published date of content) 
; search within a specific language.

Search Results :

 * A link to the item with the "title" appearing as the link text. Summary text of the item must be displayed below the link. The language of the content matching the search must be displayed within the link text. e.g. "en - finance bill"

 * Search results must be paged -- the number of items on a page must be configurable via a search parameter.  e.g. If there are 50 hits matching a keyword, and the page size has been set to 10 - only the first ten results must be shown on the first page, with links to pages 2, 3, 4 and 5.

 * Search keywords must be highlighted in the result document. It must also be possible to turn highlighting on/off on the result page by clicking on a link.
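
The paging rule can be checked with a small sketch. `paginate` is a hypothetical helper, not an existing Bungeni function; it returns the hits for the requested page and the page numbers to link to.

```python
import math


def paginate(hits, page, page_size):
    """Return (hits on this page, page numbers to link to)."""
    total_pages = max(1, math.ceil(len(hits) / page_size))
    # Clamp out-of-range page numbers instead of raising.
    page = min(max(page, 1), total_pages)
    start = (page - 1) * page_size
    links = [p for p in range(1, total_pages + 1) if p != page]
    return hits[start:start + page_size], links


all_hits = list(range(50))  # 50 stand-in hits
first_page, links = paginate(all_hits, page=1, page_size=10)
print(len(first_page), links)  # 10 [2, 3, 4, 5]
```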

Current Status :

The bungeni build includes the Xapian search engine. There appears to be some 
kind of integrated search functionality which indexes parliamentary objects 
e.g. if you log in as admin and go to http://site:8081/search you get a search 
form -- and if you search for "question" or "motion" there appears to be some 
kind of result. It is not clear to what extent this has been implemented -- 
perhaps this can be a starting point for building this functionality.

Original comment by ashok.ha...@gmail.com on 12 Jan 2011 at 6:35

GoogleCodeExporter commented 8 years ago
Point (3) of the "functionality" section should probably be clarified/extended 
to take into account additional UI logic and conventions. Specifically, it 
should state what the behaviour should be for sections that are 
conceived to show the same content irrespective of who the user is (and their 
permissions). 

More specifically, the BusinessLayer is intended to do exactly this: 
irrespective of the user, listings there should always show the same results, 
i.e. only information that is public. 

The UI Layer may have other considerations on how the search interface is done, 
on whether it should really be on every page, on whether it should offer 
different search options depending on what kind of user/privileges.

Original comment by mario.ruggier@gmail.com on 13 Jan 2011 at 3:37

GoogleCodeExporter commented 8 years ago
Updated summary of this issue is presented below
------------------------------------------------

This issue is in 2 parts :
  Part 1) Search indexer for Bungeni
  Part 2) Search User Interface

Part 1)  Search Indexer for Bungeni --

The bungeni build includes the Xapian search engine. There appears to be
some kind of integrated search functionality which indexes parliamentary
objects e.g. if you log in as admin and go to http://site:8081/search you
get a search form -- and if you search for "question" or "motion" there
appears to be some kind of result. It is not clear to what extent this has
been implemented -- perhaps this can be a starting point for building this
functionality. The aim of this task is to either fix the existing indexer
functionality or replace it with another.

The following are the requirements for the search indexer --

  1.1) The indexer must be object aware. In Bungeni parliamentary  objects
& content types (Questions, Motions, Bills, Members of Parliament,
Committees, Parliaments etc. ) are abstracted as objects with methods and
attributes.  The search indexer must be able to browse and index objects.
     1.1.1) It must be possible to configure which kinds of objects can be
indexed (Bills, Motions, Questions, Tabled Documents, Agenda Items,
Parliaments, Members of Parliament, Government, Minister, Committees, etc.)
-- and also which attributes of the object are indexed by the search
indexer.
  1.2) The indexer must be translations aware, i.e. in Bungeni content can
be translated into multiple languages -- the user is allowed to switch the
active language of the site; a search made in a particular language must
preferentially search for content in that language, before defaulting to the
base language (English).
  1.3) Attachment contents must also be indexed ;  where possible known
attachment binaries (e.g. pdf, word ) must also be indexed and searched.
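
Requirement 1.1.1 could look roughly like the following: a mapping from object type to the attributes to index. The `INDEXABLE` mapping, the attribute names and the demo classes are all hypothetical; the real Bungeni object model and any Xapian field mapping are not shown.

```python
# Hypothetical configuration: object type name -> attributes to index.
INDEXABLE = {
    "Bill": ["title", "body"],
    "Committee": ["name"],
}


def extract_fields(obj, config=INDEXABLE):
    """Return {attribute: text} for a configured object, or None to skip it."""
    attributes = config.get(type(obj).__name__)
    if attributes is None:
        return None  # this type is not configured as indexable
    return {
        name: str(getattr(obj, name))
        for name in attributes
        if getattr(obj, name, None) is not None
    }


class Bill:
    def __init__(self, title, body, internal_note):
        self.title = title
        self.body = body
        self.internal_note = internal_note  # not configured, so never indexed


class Sitting:  # not listed in INDEXABLE
    pass


print(extract_fields(Bill("Finance Bill", "Text of the bill", "private")))
print(extract_fields(Sitting()))
```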

Part 2) Search User Interface  --

   2.1) General Search -- This appears as a Search Box on the top right
hand corner of pages in Bungeni.  Searching for a keyword in this box
searches indexed parliamentary content for the keyword and returns results.

      2.1.1) Anonymous Users -- for anonymous users the search results will
return only legislative content which is public. (i.e. zope.public)
      2.1.2) Logged in Users --
            * When a user is logged in and in the "Workspace" section the
search results must be permissions aware -- i.e. they must show all
documents for which the user has view/read permissions.
            * When a user is logged in and in the "Business"  section the
search results must show only legislative content that is public (i.e. like
the behaviour for anonymous users)
            * When a user is logged in and in the "Archive" section -- the
search results will behave like in the "Business" section but also search
within  Parliamentary object information (like Parliaments, Committees,
Governments ...things which are not "Documents" )
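
The section rules in 2.1.1 and 2.1.2 amount to a small decision table, sketched here. The section names as lowercase strings and the returned `(visibility, include_metadata)` pair are assumptions made for illustration; anonymous users are treated as always public-only, following 2.1.1.

```python
def search_scope(section, logged_in):
    """Return (visibility, include_metadata) for a general search.

    visibility is "public" (public content only) or "permitted" (everything
    the user has view permission for). include_metadata says whether
    non-document objects (Parliaments, Committees, ...) are searched too.
    """
    if not logged_in:
        return ("public", False)      # 2.1.1
    if section == "workspace":
        return ("permitted", False)   # 2.1.2, first bullet
    if section == "archive":
        return ("public", True)       # 2.1.2, third bullet
    return ("public", False)          # "business" and any other section


print(search_scope("workspace", logged_in=True))
print(search_scope("archive", logged_in=True))
```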

    2.2) Advanced Search -- An advanced search page is required which allows
more detailed options for filtering search results :
        2.2.1) Allow filtering of search results by 'content type' - e.g.
search only within Questions, search only within Motions, search only within
"Committee" type object etc. This will be presented as a drop-down combo
box.
        2.2.1.1) Within the 'content type' it must be further possible to
filter by the object's indexed attributes (see 1.1.1 )
        2.2.2) Allow filtering of content by language - i.e. search for
content of a specific language
        2.2.3) Allow filtering of parliamentary document content by
"Status"
        2.2.4) Allow filtering of parliamentary document content by "Status
Date"
        2.2.5) It must be possible to combine the above filters into one
query from the advanced search page.
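
Combining the filters of 2.2.1 to 2.2.4 into one query (2.2.5) can be sketched as a single predicate in which every supplied filter must hold. The dictionary keys, the status values and the treatment of "Status Date" as a date range are assumptions; a real implementation would build the equivalent query in the search engine rather than filter in Python.

```python
from datetime import date


def matches(doc, content_type=None, language=None, status=None,
            status_date_from=None, status_date_to=None):
    """Return True when doc satisfies every filter that was supplied."""
    if content_type is not None and doc["type"] != content_type:
        return False
    if language is not None and doc["language"] != language:
        return False
    if status is not None and doc["status"] != status:
        return False
    if status_date_from is not None and doc["status_date"] < status_date_from:
        return False
    if status_date_to is not None and doc["status_date"] > status_date_to:
        return False
    return True


docs = [
    {"type": "Question", "language": "en", "status": "admissible",
     "status_date": date(2011, 1, 10)},
    {"type": "Question", "language": "fr", "status": "admissible",
     "status_date": date(2011, 1, 12)},
    {"type": "Motion", "language": "en", "status": "draft",
     "status_date": date(2011, 2, 1)},
]
selected = [d for d in docs
            if matches(d, content_type="Question", language="en",
                       status_date_from=date(2011, 1, 1))]
print(len(selected))
```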

     2.3) Search Results --
         2.3.1) A link to the item with the "title" appearing as the link
text. Summary text of the item must be displayed below the link. The
language of the content matching the search must be displayed within the
link text. e.g. "en - finance bill"
         2.3.2) Search results must be paged -- the number of items on a
page must be configurable via a search parameter.  e.g. If there are 50 hits
matching a keyword, and the page size has been set to 10 - only the first ten
results must be shown on the first page, with links to pages 2, 3, 4
and 5.
         2.3.3) Search keywords must be highlighted in the result document. It
must also be possible to turn highlighting on/off on the result page by
clicking on a link.
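
A minimal sketch of 2.3.3, assuming the results are rendered as HTML and that the on/off link simply passes a flag to the view. The `highlight` function and the use of `<em>` are illustrative choices, not existing Bungeni code.

```python
import html
import re


def highlight(text, keywords, enabled=True):
    """Wrap each keyword occurrence in <em>, escaping everything else."""
    words = [k for k in keywords if k]  # ignore empty keywords
    if not enabled or not words:
        return html.escape(text)
    pattern = re.compile(
        "(" + "|".join(re.escape(k) for k in words) + ")", re.IGNORECASE
    )
    # With one capturing group, re.split alternates non-match / match parts,
    # so the matched keywords sit at the odd indexes.
    parts = pattern.split(text)
    return "".join(
        "<em>" + html.escape(part) + "</em>" if i % 2 else html.escape(part)
        for i, part in enumerate(parts)
    )


print(highlight("The Finance Bill", ["finance"]))
print(highlight("The Finance Bill", ["finance"], enabled=False))
```

Splitting the raw text before escaping keeps a keyword from matching inside an HTML entity such as `&amp;`.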

Original comment by ashok.ha...@gmail.com on 3 Feb 2011 at 7:43

GoogleCodeExporter commented 8 years ago
r7900: paged search results. For testing, the number of items per page is 
hardcoded to 1, because there is not enough test data.

Original comment by anton.op...@gmail.com on 13 Mar 2011 at 8:14

GoogleCodeExporter commented 8 years ago

Original comment by ashok.ha...@gmail.com on 1 Dec 2011 at 11:44