Property accesses are not documented clearly

GoogleCodeExporter commented 8 years ago

What steps will reproduce the problem?

1. Write a Connector that delivers a Document with a Search URL.

What is the expected output? What do you see instead?

The Connector Manager should send the Document as a URL Feed with meta
data. But we se an exception: Search URL null is malformed. This is caused
by a mistake in the Connector Manager code (see below).

What version of the product are you using? On what operating system?

GSA version 5.0.0
Connector Manager 1.0.3 revision 763

Please provide any additional information below.

As you know the structure of documents, defined in the SPI comes with a
little problem. If you read a Property the values are lost after the call
of nextValue(). In the DocPusher class you read out the search URL to find
out whether the Feed type is XML_FEED_METADATA_AND_URL:

private void setFeedType(Document document) {
  if (getOptionalString(document, SpiConstants.PROPNAME_SEARCHURL) != null)   
  {
    this.feedType = XML_FEED_METADATA_AND_URL;
  } else {
    this.feedType = XML_FEED_INCREMENTAL;
  }
}

After that the document will not have a value for search URL any more! But
after that you do this in DocPusher class:

protected InputStream buildXmlData(Document document, String connectorName) 
{
  ...  
  // build record
  String searchurl = null;
  if (this.feedType == XML_FEED_METADATA_AND_URL) {
    searchurl = getOptionalString(document,
    SpiConstants.PROPNAME_SEARCHURL);
    // check that this looks like a URL
    try {
      new URL(searchurl);
    } catch (MalformedURLException e) {
      LOGGER.warning("Supplied search url " + searchurl
                       + " is malformed: " + e.getMessage());
      return null;
    }
  } 
  ...
}

So There will always be the exception "search URL null is malformed". One
way of solving this problem is to add the value in the search URL twice.
But of course thats not good style. So you may hold the value of display
url in the DocPusher and avoid a second call of getOptionalString(document,
SpiConstants.PROPNAME_SEARCHURL);

Regards

Original issue reported on code.google.com by andree.j...@googlemail.com on 5 May 2008 at 9:40

GoogleCodeExporter commented 8 years ago

Thanks for the detailed report. It sounds like in your implementation of 
Document the property is consumed 
by the call to findProperty. It should not be. The Connector Manager is free to 
retrieve properties multiple 
times. 

The SharePoint connector uses metadata and URL feeds, so it can't be as 
completely broken as you describe.

There is a comment in the API documentation for the Document class that 
"Important: a Property object 
obtained by calling findProperty(String) is invalidated by the next call to 
findProperty(String)." This is not a 
requirement on the connector implementation, but on the Connector Manager. 
There is a similar restriction 
for DocumentList. Together, they are intended to allow the connector 
implementation to implement the 
DocumentList, Document and Property interfaces in the same class, and avoid 
extraneous instances of the 
Document and Property implementation classes. It's a restriction on the 
Connector Manager that the Property 
object returned by findProperty must not be used again after the next call to 
findProperty, because it may now 
refer to a different property. The underlying Document property should still 
exist.

I don't think that any of the core connectors actually take advantage of this 
provision. They all have Document 
implementations that contain the properties in some repository-specific manner, 
and then create new 
Property instances for each findProperty call. For examples, see

http://code.google.com/p/google-enterprise-connector-
manager/source/browse/trunk/projects/connector-
manager/source/javatests/com/google/enterprise/connector/jcr/JcrDocument.java

in the core Connector Manager project, or the SharePoint implementation at

http://code.google.com/p/google-enterprise-connector-
sharepoint/source/browse/trunk/projects/sharepoint/source/java/com/google/enterp
rise/connector/sharep
oint/client/SPDocument.java

Unless you have more details, or a different explanation of the problem, I'm 
going to mark this bug as invalid.

Original comment by jl1615@gmail.com on 6 May 2008 at 12:20

GoogleCodeExporter commented 8 years ago

[deleted comment]

GoogleCodeExporter commented 8 years ago

Thanks for the fast reply! I have directly stored the document properties in my 
SPI
implementation. So the Properties are delivered call by reference and the SPI 
value
object is lost, after the call of nextValue();
I understood the way to avoid this issue after having a look at the 
JcrDocument. I
will change my implementation. As I suppose that other people may make the same
mistakes you may add some comment to the API documentation for the Document.
In my opinion something like "deliver call by value" would be more easy to 
understand.

Original comment by andree.j...@googlemail.com on 6 May 2008 at 9:51

GoogleCodeExporter commented 8 years ago

I'll accept this as a documentation bug. I agree that the requirement for 
multiple accesses should be more clear.

Original comment by jl1615@gmail.com on 6 May 2008 at 6:19

Changed title: Property accesses are not documented clearly
Changed state: Accepted
Added labels: Component-Docs

GoogleCodeExporter commented 8 years ago

Original comment by jl1615@gmail.com on 6 May 2008 at 10:22

GoogleCodeExporter commented 8 years ago

We should also document more clearly that findProperty might be called with 
property names that are not 
returned by getPropertyNames, and that null should be returned in that case, 
rather than throwing an exception. 
Alternatively, we could modify the connector manager to check the property 
names before calling findProperty, 
but I don't like that as much.

Original comment by jl1615@gmail.com on 15 Jul 2008 at 7:28

GoogleCodeExporter commented 8 years ago

Original comment by Brett.Mi...@gmail.com on 19 Nov 2008 at 12:44

Changed state: Fixed

GoogleCodeExporter commented 8 years ago

Original comment by jl1615@gmail.com on 12 Jan 2009 at 3:47

Added labels: Milestone-Release_1.3.0

PILLUTLAAVINASH / google-enterprise-connector-manager

Property accesses are not documented clearly #85