Norconex / crawlers

Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or filesystem to various data repositories such as search engines.
https://opensource.norconex.com/crawlers
Apache License 2.0

Navigation within the API docs #261

Closed liar666 closed 8 years ago

liar666 commented 8 years ago

Hi again,

I ran into some difficulties while navigating the API docs: when browsing the docs for a class in one of the Norconex packages, if an object is referred to but comes from another package, there is no link to its own API doc page. This makes navigating between API pages for objects in different packages a pain.

The exact problem I encountered is the following: I was trying to understand how to write an IHttpDocumentProcessor. So I went to: http://www.norconex.com/collectors/collector-http/latest/apidocs/com/norconex/collector/http/doc/IHttpDocumentProcessor.html

This page refers to "com.norconex.collector.http.doc.HttpDocument" as the manipulated object. I wanted to know which methods are available on it. Unfortunately, this object is in another package, so there is no link to its documentation page and you have to find the correct URL yourself. I finally found it at: http://www.norconex.com/collectors/collector-http/latest/apidocs/com/norconex/collector/http/doc/HttpDocument.html

In the same way, this new page refers to a "com.norconex.importer.doc.ImporterDocument" object, which is again in another package, so I had to find the following link myself: http://www.norconex.com/collectors/importer/latest/apidocs/com/norconex/importer/doc/ImporterDocument.html

And once again, you are referred to a "com.norconex.commons.lang.io.CachedInputStream", which is in another package, so its API page is not linked...
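For anyone doing the same lookup by hand: the URLs quoted above follow the standard Javadoc layout, where the dots in a fully qualified class name become path separators under the API docs root and ".html" is appended. A minimal sketch of that mapping, assuming top-level classes only (nested classes are laid out differently). The class name `ApiDocUrl` and the helper `apiDocUrl` are made up for illustration; the base URLs are the ones quoted in this thread.

```java
public class ApiDocUrl {
    // API doc roots copied from the links quoted in this thread.
    static final String HTTP_COLLECTOR_DOCS =
            "http://www.norconex.com/collectors/collector-http/latest/apidocs/";
    static final String IMPORTER_DOCS =
            "http://www.norconex.com/collectors/importer/latest/apidocs/";

    /**
     * Hypothetical helper: maps a fully qualified, top-level class name
     * to its page under a Javadoc root (dots become slashes, plus ".html").
     */
    static String apiDocUrl(String baseUrl, String className) {
        // Tolerate a root given without its trailing slash.
        String base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
        return base + className.replace('.', '/') + ".html";
    }

    public static void main(String[] args) {
        System.out.println(apiDocUrl(HTTP_COLLECTOR_DOCS,
                "com.norconex.collector.http.doc.HttpDocument"));
        System.out.println(apiDocUrl(IMPORTER_DOCS,
                "com.norconex.importer.doc.ImporterDocument"));
    }
}
```

The two printed URLs are the HttpDocument and ImporterDocument pages linked above. You still need to know which product's docs root a given package lives under, which is exactly the part that missing links make tedious.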

This process is very time-consuming... It does not help to quickly grasp the inner workings of Norconex's products and focus on the programming task :{

Wouldn't it be possible to use the "javadoc" tool directly for all packages, so that it can create the HTML links between the various elements of the http-collector/importer/etc.?

essiembre commented 8 years ago

Links within the same product already work, so I could not reproduce your first example (the link to HttpDocument from IHttpDocumentProcessor works for me).

But the link from HttpDocument pointing to ImporterDocument does not work and it definitely should. Thanks for pointing this out.

I believe they used to work just fine, including several links pointing to third-party libraries. I will investigate why this is no longer the case.

essiembre commented 8 years ago

Links in Javadocs pointing to other libraries are now working for the HTTP Collector, Collector Core, and Importer Javadocs.
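The thread does not say how the links were restored. One standard way to get cross-library links is the javadoc tool's `-link` option, which takes the root URL of another library's published API docs and turns references to its classes into hyperlinks. A rough sketch of assembling such a command line follows. The class `JavadocLinkArgs`, the helper `buildArgs`, and the output and source paths are assumptions for illustration, not the project's actual build setup; the Importer docs root is the one quoted earlier in the thread.

```java
import java.util.ArrayList;
import java.util.List;

public class JavadocLinkArgs {

    /**
     * Hypothetical helper: builds javadoc arguments, adding one
     * "-link <root>" pair per external API docs location.
     */
    static List<String> buildArgs(String outputDir, String sourcePath,
            String subpackages, List<String> externalDocRoots) {
        List<String> args = new ArrayList<>();
        args.add("-d");
        args.add(outputDir);          // where the generated HTML goes
        args.add("-sourcepath");
        args.add(sourcePath);         // where the .java sources live
        args.add("-subpackages");
        args.add(subpackages);        // document this package tree
        for (String root : externalDocRoots) {
            args.add("-link");
            args.add(root);           // root URL of another library's API docs
        }
        return args;
    }

    public static void main(String[] args) {
        // Paths assume a conventional Maven layout (an assumption).
        List<String> javadocArgs = buildArgs(
                "target/apidocs",
                "src/main/java",
                "com.norconex.collector.http",
                List.of("http://www.norconex.com/collectors/importer/latest/apidocs/"));
        System.out.println("javadoc " + String.join(" ", javadocArgs));
    }
}
```

At generation time javadoc fetches the package index published under each linked root (`package-list` on older JDKs, `element-list` on newer ones), so the linked docs must be reachable when the build runs.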