Jul 08, 2013 - Searchdaimon is an open source search engine for corporate data and websites. It aims to be as simple to use as your favorite Internet search engine, yet has the added power of delivering results from numerous systems with standardized attribute navigation.
It comes with a powerful administrator interface and can index websites and several common enterprise systems such as SharePoint, Exchange, SQL databases and Windows file shares. The ES supports many data formats (e.g., Word, PDF, Excel) and offers faceted search, attribute navigation and collection sorting.
You can set up your development environment directly on the ES with little effort, then tweak and change what you want.
More information about the product and screenshots are available at the bottom of this page.
We have released all our source code related to the Searchdaimon enterprise search engine to GitHub, with full commit history. From now on we will use this GitHub repository internally as our only source code repository. It is currently about 100K lines of code. This is the full source code, tools and build chain needed to build the ES.
The only exception is that we unfortunately have had to hold back some related source code that uses proprietary code from third parties. However, none of this is essential, and binaries are made freely available for most of it. A list of non-released code is available at: Statement on non-open source code.
The ES software is currently about 100K lines of code and is mostly written in C and Perl.
C: 72.00%, Perl: 18.79%, Yacc: 5.93%, Lex: 2.58%, sh: 0.35%, Java: 0.16%, C++: 0.13%, AWK: 0.04%
More statistics are available on our Ohloh page.
It is easy to get started developing on the ES. Because the ES comes as a virtual machine with all software installed, all you need to do to get a development environment up and running is add the development plugin and then check out the source code with git.
Please see the full article on how to get started as a developer: Setting up an ES for development
Developer mailing list: http://lists.searchdaimon.com/mailman/listinfo/sddev
Some functions should be developed as plugins so that other parts of the system can be changed without interfering with them. The preferred way is to use Perl, as described in the “Plugins” section at https://github.com/searchdaimon/enterprise-search/wiki/Overview-of-the-code .
The ES is licensed under the GNU General Public License version 2 (GPLv2).
Please see the official manual for information on how to set up and use the ES as a system administrator or end user: http://www.searchdaimon.com/documentation/
Talk with the ES staff, users and other developers at http://www.searchdaimon.com/forum/ .
You can restrict your search to type of document, data source, date and meta-information such as contacts, customers, sales and projects. You can also sort on date or relevancy.
In the above picture, you can see the results of the query "enterprise search". The search has been further broken down to only include documents from the "Sales" collection. You can also filter the search to only include documents from a file type like Excel or PowerPoint, or from a date interval like this year or older than two years.
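The refinement described above can be pictured as a filter over the result list plus a count per attribute value for the navigation menu. The following is only a minimal sketch: the hit fields (`collection`, `filetype`, `modified`), the function names and the sample data are all made up for illustration and do not reflect how the ES stores or filters results internally.

```python
from datetime import date

# Hypothetical sample hits; the field names are illustrative assumptions.
HITS = [
    {"title": "Enterprise search pricing", "collection": "Sales",
     "filetype": "Excel", "modified": date(2013, 3, 1)},
    {"title": "Enterprise search pitch", "collection": "Sales",
     "filetype": "PowerPoint", "modified": date(2010, 6, 15)},
    {"title": "Enterprise search design notes", "collection": "Engineering",
     "filetype": "Word", "modified": date(2013, 1, 20)},
]


def refine(hits, collection=None, filetype=None, newer_than=None):
    """Keep only the hits that match every facet that was supplied."""
    result = []
    for hit in hits:
        if collection is not None and hit["collection"] != collection:
            continue
        if filetype is not None and hit["filetype"] != filetype:
            continue
        if newer_than is not None and hit["modified"] < newer_than:
            continue
        result.append(hit)
    return result


def facet_counts(hits, attribute):
    """Count how many hits carry each value of one attribute."""
    counts = {}
    for hit in hits:
        value = hit[attribute]
        counts[value] = counts.get(value, 0) + 1
    return counts


sales_hits = refine(HITS, collection="Sales")
recent_sales = refine(HITS, collection="Sales", newer_than=date(2013, 1, 1))
```

Facets combine with a logical AND, so each extra restriction can only shrink the result list, which matches the drill-down behaviour described above.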
Structured data can be presented as a table among the results. Often you'll see the information you need in the user interface, without having to open the data source itself.
Above you see a search that returns hits in a holiday schedule. The schedule is a structured list in SharePoint. We can see that Runar Buvik is entered for a trip to the Netherlands from 07-11-2009 to 07-25-2009.
The ES suggests query words while you are writing. The words proposed are fetched from documents the user has access to, so that domain and product names, which you can't find in traditional dictionaries, are included.
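One way to picture access-aware suggestions is to build the vocabulary only from documents the user may read and then match on the typed prefix. This is a minimal sketch under assumptions: the document layout (`text`, `readers`), the function names and the ranking by frequency are hypothetical and are not taken from the ES source code.

```python
# Hypothetical documents; "readers" is an assumed access list per document.
DOCS = [
    {"text": "Searchdaimon enterprise search", "readers": {"alice", "bob"}},
    {"text": "secret salary spreadsheet", "readers": {"bob"}},
]


def build_vocabulary(documents, user):
    """Count words, using only documents the given user is allowed to read."""
    counts = {}
    for doc in documents:
        if user not in doc["readers"]:
            continue  # never suggest words from documents the user cannot open
        for word in doc["text"].lower().split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def suggest(prefix, vocabulary, limit=5):
    """Return words starting with the prefix, most frequent first."""
    prefix = prefix.lower()
    matches = [word for word in vocabulary if word.startswith(prefix)]
    matches.sort(key=lambda word: (-vocabulary[word], word))
    return matches[:limit]


alice_suggestions = suggest("se", build_vocabulary(DOCS, "alice"))
bob_suggestions = suggest("se", build_vocabulary(DOCS, "bob"))
```

In this sketch "searchdaimon" is offered to both users because it occurs in a shared document, while "secret" is only offered to the user who can read the document it comes from.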
The ES can propose correctly spelled words if you have misspelled a word, for example correcting "enterprie serch" to "enterprise search". As with Suggest, the dictionary is built from indexed documents.
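A common way to implement this kind of correction is to pick the dictionary word with the smallest edit distance to each query word. The sketch below uses plain Levenshtein distance as a stand-in; the source does not say which algorithm the ES uses, and the dictionary contents, frequencies and function names here are invented for illustration.

```python
# Hypothetical dictionary built from indexed documents: word -> frequency.
DICTIONARY = {"enterprise": 12, "search": 30, "searchdaimon": 7, "sales": 5}


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def correct_word(word, dictionary, max_distance=2):
    """Return the closest dictionary word, or the word itself if none is close."""
    if word in dictionary:
        return word
    best, best_key = word, None
    for candidate, frequency in dictionary.items():
        distance = edit_distance(word, candidate)
        if distance > max_distance:
            continue
        key = (distance, -frequency, candidate)  # nearest first, then most frequent
        if best_key is None or key < best_key:
            best, best_key = candidate, key
    return best


def correct_query(query, dictionary):
    """Correct each word of the query independently."""
    return " ".join(correct_word(word, dictionary) for word in query.lower().split())


corrected = correct_query("enterprie serch", DICTIONARY)
```

Because the dictionary comes from indexed documents rather than a general word list, a product name such as "searchdaimon" counts as correctly spelled and is left alone.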
Searching for "car" also shows documents containing "cars", etc.
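Matching "car" against "cars" is typically done by reducing both query and document words to a common stem before comparing them. The following is a deliberately naive sketch that only strips a few English plural endings; it is an assumption made for illustration, not the stemmer the ES actually uses, and a real stemmer handles far more cases.

```python
def stem(word):
    """Reduce a word to a crude stem by stripping simple plural endings."""
    word = word.lower()
    if len(word) > 4 and word.endswith("ies"):
        return word[:-3] + "y"   # companies -> company
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]         # cars -> car
    return word


def matches(query, text):
    """True if every stemmed query word occurs among the stemmed text words."""
    query_stems = {stem(word) for word in query.split()}
    text_stems = {stem(word) for word in text.split()}
    return query_stems <= text_stems


found_plural = matches("car", "Two cars for sale")
found_unrelated = matches("car", "A new carpet")
```

Stemming both sides means the match is symmetric: a query for "cars" also finds a document that only mentions "car".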
The ES comes with a powerful, intuitive web interface for administration.
Shows status and lets you configure your collections. Collections are grouped by type of crawler (SMB, Exchange, etc.).
View aggregated and raw data to see which users are the most active, what the most popular queries are, and how many searches are performed every day.
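The aggregated figures mentioned above (most active users, most popular queries, searches per day) amount to simple counts over the raw query log. Here is a minimal sketch; the log record layout (`day`, `user`, `query`) and the sample entries are invented for illustration and say nothing about the ES's real log format.

```python
from collections import Counter

# Hypothetical raw query log; field names and entries are illustrative only.
LOG = [
    {"day": "2013-07-08", "user": "alice", "query": "enterprise search"},
    {"day": "2013-07-08", "user": "bob", "query": "holiday schedule"},
    {"day": "2013-07-09", "user": "alice", "query": "Enterprise Search"},
]


def summarize(log):
    """Aggregate a raw query log into per-query, per-user and per-day counts."""
    return {
        "queries": Counter(entry["query"].lower() for entry in log),
        "users": Counter(entry["user"] for entry in log),
        "per_day": Counter(entry["day"] for entry in log),
    }


stats = summarize(LOG)
top_query = stats["queries"].most_common(1)
top_user = stats["users"].most_common(1)
```

Queries are lower-cased before counting so that "Enterprise Search" and "enterprise search" are reported as the same popular query.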