Alfresco / alfresco-indexer

A custom way to index Alfresco changes.
Apache License 2.0

alfresco-indexer

What is it?

Alfresco Indexer is an API that lets you index content stored in Alfresco when you want and how you want, selecting only the content you are interested in.

Compatibility Matrix

| Alfresco Indexer version | (shipped with) ManifoldCF version | (tested with) Alfresco edition/version |
| --- | --- | --- |
| 0.7.x | 1.8.0 to 2.2.0-RC0 | Community 5.0.[a,b,c,d], Enterprise 4.2.x |
| 0.8.x | trunk (master) - WIP | Community 5.0.d, Enterprise 5.0.x |

Community 5.1.[a,b,c]-EA is a work in progress (add issue link). There may be other permutations that work but haven't been tested.

Run Tests

git clone git@github.com:maoo/alfresco-indexer.git
cd alfresco-indexer
mvn clean install -DskipTests
cd alfresco-indexer-webscripts-war
mvn clean integration-test

To learn how to build master and test it against ManifoldCF, follow these instructions.

Project Structure

/**
* Fetches nodes from Alfresco that have changed since the given transaction and ACL changeset.
*
* @param lastTransactionId
*         the id of the last transaction already indexed; it can be considered a "startFrom" param
* @param lastAclChangesetId
*         the id of the last ACL changeset already indexed; it can be considered a "startFrom" param
* @param filters
*         the filters to apply when selecting nodes
* @return an {@link AlfrescoResponse}
* @throws AlfrescoDownException
*/
AlfrescoResponse fetchNodes(long lastTransactionId, long lastAclChangesetId, AlfrescoFilters filters)
    throws AlfrescoDownException;

/**
* Fetches Node Info from Alfresco for a given node.
* @param nodeUuid the UUID for the node
* @return an {@link AlfrescoResponse}
* @throws AlfrescoDownException
*/
AlfrescoResponse fetchNode(String nodeUuid) throws AlfrescoDownException;

/**
* Fetches metadata from Alfresco for a given node.
* @param nodeUuid
*        the UUID for the node
* @return a map with metadata created from a JSON object
*/
Map<String, Object> fetchMetadata(String nodeUuid) throws AlfrescoDownException;
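
The "startFrom" semantics of fetchNodes suggest an incremental polling loop: feed the ids returned by one call into the next. The sketch below is only an illustration under stated assumptions. The nested types are hypothetical stand-ins, not the classes shipped in alfresco-indexer-client; the field names on the stand-in AlfrescoResponse, the unchecked AlfrescoDownException, and the choice of 0 as the initial id are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the client types; the real classes may expose
// different accessors and a different exception hierarchy.
class IncrementalFetchSketch {

    static class AlfrescoDownException extends RuntimeException {
        AlfrescoDownException(String message) { super(message); }
    }

    static class AlfrescoFilters { }

    static class AlfrescoResponse {
        final long lastTransactionId;
        final long lastAclChangesetId;
        final List<Map<String, Object>> documents;

        AlfrescoResponse(long lastTransactionId, long lastAclChangesetId,
                         List<Map<String, Object>> documents) {
            this.lastTransactionId = lastTransactionId;
            this.lastAclChangesetId = lastAclChangesetId;
            this.documents = documents;
        }
    }

    interface AlfrescoClient {
        AlfrescoResponse fetchNodes(long lastTransactionId, long lastAclChangesetId,
                                    AlfrescoFilters filters) throws AlfrescoDownException;
    }

    /** Calls fetchNodes repeatedly, resuming from the ids of the previous response. */
    static List<Map<String, Object>> fetchAll(AlfrescoClient client, AlfrescoFilters filters) {
        long txn = 0; // assumed to mean "start from the beginning"
        long acl = 0;
        List<Map<String, Object>> all = new ArrayList<>();
        while (true) {
            AlfrescoResponse page = client.fetchNodes(txn, acl, filters);
            if (page.documents.isEmpty()) {
                break; // nothing new since the last ids
            }
            all.addAll(page.documents);
            if (page.lastTransactionId == txn && page.lastAclChangesetId == acl) {
                break; // ids did not advance; stop rather than loop forever
            }
            txn = page.lastTransactionId;
            acl = page.lastAclChangesetId;
        }
        return all;
    }

    public static void main(String[] args) {
        // Fake client that serves one document per transaction for three transactions.
        AlfrescoClient fake = (txn, acl, filters) -> txn < 3
            ? new AlfrescoResponse(txn + 1, acl, Collections.singletonList(
                  Collections.<String, Object>singletonMap("uuid", "node-" + txn)))
            : new AlfrescoResponse(txn, acl, Collections.<Map<String, Object>>emptyList());
        System.out.println(fetchAll(fake, new AlfrescoFilters()).size() + " documents fetched");
    }
}
```

A real indexer would persist the last pair of ids between runs so that a restart resumes where the previous run stopped.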

Differences with Alfresco-Solr integration

The software architecture of Alfresco Indexer is the same as the one delivered by the Alfresco-Solr integration:

Nevertheless, the following differences can be noted:

To summarise, advantages of using Alfresco Indexer:

Disadvantages of using Alfresco Indexer:

Configuration

Alfresco Indexer Webscripts can be configured to tweak the indexing process; in alfresco-global.properties you can override the following default parameters.

URL Prefixes

indexer.properties.url.prefix = http://localhost:8080/alfresco/service/node/details
indexer.document.url.prefix = http://localhost:8080/alfresco/service/slingshot/node
indexer.content.url.prefix = http://localhost:8080/alfresco/service
indexer.share.url.prefix = http://localhost:8888/share
indexer.preview.url.prefix = http://localhost:8080/alfresco/service
indexer.thumbnail.url.prefix = http://localhost:8080/alfresco/service
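
To illustrate the override semantics only, the sketch below layers user-supplied values over two of the defaults listed above using java.util.Properties. Alfresco resolves alfresco-global.properties through its own configuration mechanism, so this is not how the webscripts read their settings; the class name and the example hostname are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class UrlPrefixOverrideSketch {

    /** Returns the defaults overlaid with any overrides given as properties-file text. */
    static Properties resolve(String overridesText) {
        Properties defaults = new Properties();
        defaults.setProperty("indexer.content.url.prefix", "http://localhost:8080/alfresco/service");
        defaults.setProperty("indexer.share.url.prefix", "http://localhost:8888/share");

        // Keys missing from the overrides fall back to the defaults.
        Properties effective = new Properties(defaults);
        try {
            effective.load(new StringReader(overridesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return effective;
    }

    public static void main(String[] args) {
        Properties p = resolve("indexer.share.url.prefix = https://ecm.example.com/share\n");
        System.out.println(p.getProperty("indexer.share.url.prefix"));
        System.out.println(p.getProperty("indexer.content.url.prefix"));
    }
}
```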

Node Changes paging parameters

indexer.changes.nodesperacl=10
indexer.changes.nodespertxn=10

Node Changes allowed Node Types (whitelist)

indexer.changes.allowedTypes={http://www.alfresco.org/model/content/1.0}content,{http://www.alfresco.org/model/content/1.0}folder

Other examples of allowed types:

{http://www.alfresco.org/model/forum/1.0}topic
{http://www.alfresco.org/model/forum/1.0}post
{http://www.alfresco.org/model/content/1.0}person
{http://www.alfresco.org/model/content/1.0}link
{http://www.alfresco.org/model/calendar}calendar
{http://www.alfresco.org/model/calendar}calendarEvent
{http://www.alfresco.org/model/datalist/1.0}dataList
{http://www.alfresco.org/model/datalist/1.0}dataListItem (includes all sub-types, such as dl:task, dl:event and dl:issue)
{http://www.alfresco.org/model/blogintegration/1.0}blogDetails
{http://www.alfresco.org/model/blogintegration/1.0}blogPost
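
Each allowed type is written as {namespace}localName, and the property holds a comma-separated list of them. As a rough sketch of how such a value can be split and checked before it goes into alfresco-global.properties, the hypothetical helper below parses the list into namespace and local name pairs. It is not part of Alfresco Indexer, and it assumes namespace URIs contain no commas.

```java
import java.util.ArrayList;
import java.util.List;

class AllowedTypesSketch {

    /** A type name split into its namespace URI and local name. */
    static final class TypeName {
        final String namespace;
        final String localName;

        TypeName(String namespace, String localName) {
            this.namespace = namespace;
            this.localName = localName;
        }
    }

    /** Parses a single "{namespace}localName" entry. */
    static TypeName parseOne(String entry) {
        String trimmed = entry.trim();
        int close = trimmed.indexOf('}');
        if (!trimmed.startsWith("{") || close < 0 || close == trimmed.length() - 1) {
            throw new IllegalArgumentException("Not a {namespace}localName entry: " + entry);
        }
        return new TypeName(trimmed.substring(1, close), trimmed.substring(close + 1));
    }

    /** Splits the comma-separated property value and parses each non-empty entry. */
    static List<TypeName> parse(String propertyValue) {
        List<TypeName> result = new ArrayList<>();
        for (String entry : propertyValue.split(",")) {
            if (!entry.trim().isEmpty()) {
                result.add(parseOne(entry));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String value = "{http://www.alfresco.org/model/content/1.0}content,"
                     + "{http://www.alfresco.org/model/content/1.0}folder";
        for (TypeName type : parse(value)) {
            System.out.println(type.localName + " in " + type.namespace);
        }
    }
}
```

Failing fast on a malformed entry is a deliberate choice here: a typo in the whitelist would otherwise silently exclude a type from indexing.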

Binaries

Alfresco Indexer binaries can be found in Maven Central. To use Alfresco Indexer with Apache Maven, add the following dependency to your pom.xml file:

  <dependency>
      <groupId>com.github.maoo.indexer</groupId>
      <artifactId>alfresco-indexer-client</artifactId>
      <version>0.8.0</version>
  </dependency>

Release

Before releasing, make sure you can upload artifacts to Maven Central:

mvn deploy -Pgpg

If everything goes fine, make sure you're up-to-date with git master and run the release command:

git status
netstat -anl | grep 8080 #make sure local port 8080 is free
mvn clean -Ppurge
mvn release:prepare release:perform

Follow the Sonatype docs to set up your environment.

Credits

This project has been developed by

License

Please see the file LICENSE.md for the copyright licensing conditions attached to this codebase.