richardwilly98 / es-dms

ES DMS is a collaborative content management framework using ElasticSearch and AngularJS

Searches, search results and categorization #98

Open danilos opened 10 years ago

danilos commented 10 years ago

These are ideas and comments for the development of search, result analysis and categorization capabilities in ES-DMS.

Searches and results organization:

It would be convenient to divide the search result screen into 3 panes:


  1. a search pane - with current search capabilities

  2. a tags pane - with search result word cloud tagging

  3. a categorization pane

Search pane

The search pane should provide standard search capabilities, as currently implemented. I would apply the following modification to the current implementation:


  1. Pre-fill the search criteria and run the search automatically when the user switches to this tab from another tab.


Tags pane

I would rename this pane to something like Analyze. It would provide word cloud categorization and clustering, as currently implemented. I would make the following modifications to the tab:


  1. A search field similar to the one in the search pane should be added.

  2. While the word cloud map is displayed, a search result list is not. To be consistent with the search pane, where all results are shown when no tag is selected, the tags pane should perform the search automatically and display all results.

  3. The search criteria should be pre-filled and a search performed automatically when the user switches to this tab from another tab.

Categorization pane

The categorization pane should allow quick and friendly categorization of the results. Several categorization methods are possible; the following is suggested:


  1. Results for a specific search should be listed, together with all available tags. Each document in the list should have a check box.

  2. New tags could be added to the tag list or to an individual document. All tags added to documents should be added to the tag list automatically.

  3. A user could select a number of tags and a number of documents, then click an "apply tags" button, which would add all selected tags to all selected documents.

  4. A search field similar to the one in the search pane should be added.

  5. The search criteria should be pre-filled and a search performed automatically when the user switches to this tab from another tab.
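
As a rough illustration of the "apply tags" behaviour in items 2 and 3, here is a minimal Python sketch. It is independent of the ES-DMS code base: the function name `apply_tags`, the in-memory `documents` dictionary and the `tag_list` list are all hypothetical stand-ins for whatever the real document store and tag list would be.

```python
def apply_tags(documents, tag_list, selected_doc_ids, selected_tags):
    """Add every selected tag to every selected document.

    documents: dict mapping a document id to a dict with an optional "tags" list
               (hypothetical in-memory stand-in for the document store).
    tag_list:  list of all known tags, updated in place so that any tag applied
               to a document also appears in the shared tag list.
    """
    for doc_id in selected_doc_ids:
        doc_tags = documents[doc_id].setdefault("tags", [])
        for tag in selected_tags:
            if tag not in doc_tags:  # avoid duplicate tags on a document
                doc_tags.append(tag)

    for tag in selected_tags:
        if tag not in tag_list:  # keep the shared tag list in sync
            tag_list.append(tag)

    return documents, tag_list


docs = {"doc1": {"tags": ["invoice"]}, "doc2": {}, "doc3": {}}
tags = ["invoice"]
apply_tags(docs, tags, ["doc1", "doc2"], ["invoice", "2013"])
```

Unselected documents (`doc3` here) are left untouched, and applying the same tag twice has no further effect.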


The resulting tab panes would be: Search, Analyze and Categorize.

Categorization

The word cloud uses tagging for word frequency analysis. We should look into implementing a content based word frequency analysis.


Categorization during upload could be aided by the following mechanisms:

  1. Content based word frequency analysis

  2. Location context

  3. Application of manual tagging


These could be optional mechanisms made available during upload.


Content based word frequency analysis

Content based word frequency analysis would consist of reading each document and computing word frequencies during upload. The N most frequent words and their frequency counts could then be stored.
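
A minimal sketch of this idea in plain Python, assuming the document text has already been extracted. The function name `top_words`, the tokenization rule and the optional `stop_words` filter are illustrative assumptions, not part of ES-DMS or ElasticSearch.

```python
import re
from collections import Counter


def top_words(text, n=10, stop_words=frozenset()):
    """Return the n most frequent words in text with their counts.

    Tokenization is deliberately simple (lowercased runs of letters, digits
    and apostrophes); a real implementation would likely use a proper analyzer.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)


sample = "The contract covers the delivery. The delivery date is fixed."
result = top_words(sample, n=2, stop_words={"the", "is"})
# result is [("delivery", 2), ("contract", 1)]
```

The returned list of (word, count) pairs is what could be stored alongside the document at upload time.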


Location context

Location context would consist of the location of the document file within the import. This could be applied to folder or zipped content. File hierarchy is a form of categorization that is frequently lost when transferring data from file systems to ECMs. It could be convenient to offer users an option to preserve some information about this categorization.
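
One possible way to preserve this information is to turn the folder names on a file's path into candidate tags. The sketch below assumes paths relative to the root of the imported folder or zip archive; the function name `location_tags` is hypothetical.

```python
from pathlib import PurePosixPath


def location_tags(relative_path):
    """Return the folder names leading to a file inside an import.

    Backslashes are normalized so that Windows-style paths and zip entry
    names (which use forward slashes) are handled the same way.
    """
    parts = PurePosixPath(relative_path.replace("\\", "/")).parts
    # Drop the file name itself and any root or relative markers.
    return [p for p in parts[:-1] if p not in ("/", ".", "..")]


folder_tags = location_tags("contracts/2013/acme/agreement.pdf")
# folder_tags is ["contracts", "2013", "acme"]
```

A file at the top level of the import yields an empty list, so no location tags would be suggested for it.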


Application of manual tagging

Application of manual tagging would consist simply of adding a selectable tag list to the upload file dialogue. Through the tag list, a user could select or add the tags to be applied, and the documents would be tagged with the selected tags while uploading.