Norconex / collector-filesystem

Norconex Filesystem Collector is a flexible crawler for collecting, parsing, and manipulating data ranging from local hard drives to network locations into various data repositories such as search engines.
http://www.norconex.com/collectors/collector-filesystem/

Custom meta data fetcher fetching metadata from external properties file #21

Closed jayjamba closed 6 years ago

jayjamba commented 6 years ago

MyMetadataFetcher.zip

Hi, is there a way to fetch metadata from an external properties file? We have integrated Norconex into a Spring Boot application. I declared my own custom class as a Spring component; it extends the "GenericFileMetadataFetcher" class. I am not sure how to fetch metadata from an external properties file.

Below is the sample metadata fetcher tag that I declared in the sample config XML:

If I hardcode the metadata in the above class, it gets added to the generated XML files. But I don't want to hardcode these metadata properties; instead, I want the metadata to come from the properties file. I have attached the 'MyMetadataFetcher' class.

jayjamba commented 6 years ago

@SpringBootApplication
public class FilesystemconnnectorApplication implements CommandLineRunner {

    // @Autowired
    // private MyMetadataFetcher customFetcher;

    @Autowired
    private ApplicationProperties applicationProperties;

    public static void main(String[] args) {
        SpringApplication.run(FilesystemconnnectorApplication.class, args);
    }

    @Override
    public void run(String... args) throws Exception {
        final File configFile = new File(getClass().getClassLoader().getResource("sample-config.xml").getFile());
        final File variableFile = new File(getClass().getClassLoader().getResource("sample-config.variables").getFile());

        final CollectorConfigLoader collectorConfigLoader = new CollectorConfigLoader(FilesystemCollectorConfig.class);
        final FilesystemCollectorConfig fileCollectorConfig = (FilesystemCollectorConfig) collectorConfigLoader
                .loadCollectorConfig(configFile, variableFile);
        final FilesystemCollector collector = new FilesystemCollector(fileCollectorConfig);

        collector.start(true);
    }
}

This is how my Spring Boot app starts.

jayjamba commented 6 years ago

I achieved this using ConstantTagger in preParseHandlers. But I was wondering: how can I read the external properties from my own MyMetadataFetcher?

Below is my constant tagger that worked:

<importer>
    <preParseHandlers>
        <tagger class="com.norconex.importer.handler.tagger.impl.ConstantTagger">
            <constant name="prop1">${prop1}</constant>
            <constant name="prop2">${prop2}</constant>
        </tagger>
    </preParseHandlers>
</importer>

essiembre commented 6 years ago

To be clear: you want to read a .properties file and add its content to a document's metadata via your code?

Have you tried creating your ApplicationProperties directly (or a Properties or ResourceBundle instance) as a test? If that works, it would point to an auto-wiring issue with Spring.
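
As a rough illustration of that test, here is a minimal sketch that loads a .properties source with plain java.util.Properties, with no Spring involved. The class name PropertiesMetadataLoader and the sample keys are hypothetical. How the resulting map would be copied into a document's metadata inside a custom fetcher is not shown, since that depends on the Norconex API version in use.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper (not part of Norconex): reads key=value pairs
// from a .properties source without relying on Spring auto-wiring.
public class PropertiesMetadataLoader {

    /** Loads .properties content into a map sorted by key. */
    public static Map<String, String> load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        Map<String, String> metadata = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            metadata.put(name, props.getProperty(name));
        }
        return metadata;
    }

    public static void main(String[] args) throws IOException {
        // In a real test the Reader would wrap the external file,
        // e.g. Files.newBufferedReader(path); inline text keeps this runnable.
        Map<String, String> metadata =
                load(new StringReader("prop1=value1\nprop2=value2\n"));
        metadata.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

If this prints the expected keys while the Spring-injected ApplicationProperties comes back empty or null, the problem is more likely in the wiring than in the file itself.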

jayjamba commented 6 years ago

Hi, sorry for the late reply. I haven't had a chance to test this with the application properties file. But can you let me know if there is a way to add custom metadata for a deleted file?

essiembre commented 6 years ago

Sorry, but I do not understand what you are trying to do. Could you please elaborate?

Are you talking about files sent to your committer for deletion? If so, you can write your own committer and add what you want there.
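
The general idea can be sketched as a wrapper that adds extra metadata to each deletion request before passing it on. This is only an illustration: DeletionSink is a hypothetical stand-in interface defined here so the example compiles on its own, and it is not the Norconex committer interface. A real custom committer would implement the interface from the Committer Core version you are using.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DeletionMetadataSketch {

    /** Hypothetical stand-in for a committer; NOT the Norconex interface. */
    public interface DeletionSink {
        void remove(String reference, Map<String, String> metadata);
    }

    /** Adds fixed metadata to every deletion request, then delegates. */
    public static class TaggingSink implements DeletionSink {
        private final DeletionSink delegate;
        private final Map<String, String> extra;

        public TaggingSink(DeletionSink delegate, Map<String, String> extra) {
            this.delegate = delegate;
            this.extra = new HashMap<>(extra);
        }

        @Override
        public void remove(String reference, Map<String, String> metadata) {
            // Copy first so the caller's map is left untouched.
            Map<String, String> enriched = new HashMap<>(metadata);
            enriched.putAll(extra);
            delegate.remove(reference, enriched);
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        DeletionSink target = (ref, meta) -> log.add(ref + " " + meta);
        DeletionSink tagging =
                new TaggingSink(target, Collections.singletonMap("prop1", "value1"));
        tagging.remove("file:///tmp/a.txt", new HashMap<>());
        log.forEach(System.out::println);
    }
}
```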

jayjamba commented 6 years ago

You got that right. Thanks! Writing my own committer did the job.

essiembre commented 6 years ago

Great. Thanks for confirming.