Norconex / crawlers

Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or filesystem to various data repositories such as search engines.
https://opensource.norconex.com/crawlers
Apache License 2.0

CrawlerEventListener configuration Question #656

Closed LeMoussel closed 4 years ago

LeMoussel commented 4 years ago

I am implementing this:

import java.io.FileWriter;
import java.io.IOException;

import com.norconex.collector.core.CollectorException;
import com.norconex.collector.core.crawler.ICrawler;
import com.norconex.collector.core.crawler.event.CrawlerEvent;
import com.norconex.collector.core.crawler.event.ICrawlerEventListener;

public class MyCrawlerEventListener implements ICrawlerEventListener {
  private String outputFile;

  @Override
  public void crawlerEvent(final ICrawler crawler, final CrawlerEvent event) {
    final String type = event.getEventType();

    // Create new file on crawler start
    if (CrawlerEvent.CRAWLER_STARTED.equals(type)) {
      writeLine("Crawler Start", false);
      return;
    }

    // Do some stuff .....
  }

  private void writeLine(final String message, final boolean append) {
    try (FileWriter out = new FileWriter(outputFile, append)) {
      out.write(message);
      out.write('\n');
    } catch (final IOException e) {
      throw new CollectorException("Cannot write bad link to file.", e);
    }
  }
}

with this config:

    <crawlerListeners>
      <listener class="MyCrawlerEventListener">
        <outputFile>test.tsv</outputFile>
      </listener>
    </crawlerListeners>

The outputFile field is never populated (it is always null). It should have the value test.tsv.

I don't understand why.

essiembre commented 4 years ago

The mapping is not automatic. Try having your listener also implement IXMLConfigurable and add something like this:

    @Override
    public void loadFromXML(Reader in) throws IOException {
        XMLConfiguration xml = XMLConfigurationUtil.newXMLConfiguration(in);
        outputFile = xml.getString("outputFile", outputFile);
    }

    @Override
    public void saveToXML(Writer out) throws IOException {
        // You can leave this method blank if you have no use for saving the XML back
        try {
            EnhancedXMLStreamWriter writer = new EnhancedXMLStreamWriter(out);
            writer.writeStartElement("listener");
            writer.writeAttribute("class", getClass().getCanonicalName());
            writer.writeElementString("outputFile", outputFile);
            writer.writeEndElement();
            writer.flush();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IOException("Cannot save as XML.", e);
        }
    }
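
To illustrate why the explicit mapping is needed, here is a minimal standalone sketch of what a loadFromXML implementation does conceptually. It is not Norconex code: the class name ListenerConfigSketch and the getOutputFile accessor are hypothetical, and it uses the JDK DOM parser in place of XMLConfigurationUtil so it can run without the collector on the classpath.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical stand-in for a listener; it does not implement any Norconex interface.
class ListenerConfigSketch {
    private String outputFile;

    // Same shape as the loadFromXML method above, but built on the JDK DOM
    // parser (an assumption made so the sketch is self-contained).
    public void loadFromXML(Reader in) throws IOException {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(in))
                    .getDocumentElement();
            // Copy the <outputFile> element text into the field by hand.
            NodeList nodes = root.getElementsByTagName("outputFile");
            if (nodes.getLength() > 0) {
                outputFile = nodes.item(0).getTextContent().trim();
            }
        } catch (ParserConfigurationException | SAXException e) {
            throw new IOException("Cannot load listener XML.", e);
        }
    }

    public String getOutputFile() {
        return outputFile;
    }

    public static void main(String[] args) throws IOException {
        String xml = "<listener class=\"MyCrawlerEventListener\">"
                + "<outputFile>test.tsv</outputFile></listener>";
        ListenerConfigSketch listener = new ListenerConfigSketch();
        System.out.println(listener.getOutputFile()); // prints "null"
        listener.loadFromXML(new StringReader(xml));
        System.out.println(listener.getOutputFile()); // prints "test.tsv"
    }
}
```

Until something reads the element and assigns the field, outputFile keeps its default null value, which matches the behaviour described in the question.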

LeMoussel commented 4 years ago

OK. Thank you Pascal for your help.