sawantuday / crawler4j

Automatically exported from code.google.com/p/crawler4j

Add a possibility to use a Factory for instantiating new WebCrawlers, instead of the hardcoded usage of class.newInstance() #144

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Doing this would allow, for example, Spring-instantiated objects to be used 
inside crawler4j.

I attach a zip with 2 classes that can be used to patch the crawler4j project.
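
A minimal sketch of the idea, not the attached patch itself. `WebCrawler` below is an empty stand-in so the example compiles on its own, and `CrawlerFactory`, `GreetingCrawler`, `createCrawlers` and `reflective` are hypothetical names chosen for illustration; the patch may name things differently. It contrasts reflective instantiation, which needs a no-argument constructor, with a factory that can build crawlers any way the caller likes.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for crawler4j's WebCrawler, so the sketch compiles on its own.
class WebCrawler {
}

// Hypothetical factory interface; the attached patch may name it differently.
interface CrawlerFactory<T extends WebCrawler> {
    T newInstance();
}

// A crawler with a constructor dependency: Class.newInstance() cannot build it.
class GreetingCrawler extends WebCrawler {
    final String greeting;

    GreetingCrawler(String greeting) {
        this.greeting = greeting;
    }
}

class FactorySketch {
    // Controller-side loop: asks the factory for one crawler per worker.
    static <T extends WebCrawler> List<T> createCrawlers(CrawlerFactory<T> factory, int count) {
        List<T> crawlers = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            crawlers.add(factory.newInstance());
        }
        return crawlers;
    }

    // Factory that reproduces the reflective behaviour: requires a no-arg constructor.
    static <T extends WebCrawler> CrawlerFactory<T> reflective(Class<T> type) {
        return () -> {
            try {
                return type.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
            }
        };
    }

    public static void main(String[] args) {
        // The lambda plays the role of a DI container handing out configured crawlers.
        List<GreetingCrawler> crawlers = createCrawlers(() -> new GreetingCrawler("hello"), 2);
        System.out.println(crawlers.size() + " crawlers, greeting=" + crawlers.get(0).greeting);
    }
}
```

With this shape, the existing class-based entry point could stay as a thin wrapper around `reflective(...)`, so current callers would not need to change.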

Original issue reported on code.google.com by emo.ge...@gmail.com on 5 Apr 2012 at 10:24

GoogleCodeExporter commented 9 years ago
That sounds a little like my problem, although I didn't dig into the crawler4j 
source code.

I built a system (unfortunately) using inheritance.
One of my subclasses needs to deal with pages coming from the crawler.
As it is already a subclass, it cannot inherit from WebCrawler.

I have learned my lesson: avoid inheritance in Java and use interfaces and 
composition instead.

I would like to suggest turning WebCrawler into an interface.
I would also like to suggest adding another start() method to the controller 
that accepts an object rather than a class. That should help with dependency 
injection.
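
A rough sketch of what that suggestion could look like, under stated assumptions: `PageHandler`, `ReportGenerator`, `CrawlReport` and `ToyController` are all hypothetical names, and none of them exist in crawler4j. The point is only that a class which already has a superclass can still receive pages if the crawler contract is an interface and the controller accepts an instance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface, standing in for an interface-based WebCrawler.
interface PageHandler {
    void visit(String url);
}

// Base class from the application's own hierarchy (illustrative only).
class ReportGenerator {
    final List<String> lines = new ArrayList<>();
}

// Already a subclass, so it cannot extend WebCrawler too, but it can implement an interface.
class CrawlReport extends ReportGenerator implements PageHandler {
    @Override
    public void visit(String url) {
        lines.add("visited " + url);
    }
}

// Toy controller whose start() accepts a ready-made object instead of a Class.
class ToyController {
    private final List<String> seeds = new ArrayList<>();

    void addSeed(String url) {
        seeds.add(url);
    }

    // No real crawling here: each seed is simply handed to the supplied handler.
    void start(PageHandler handler) {
        for (String url : seeds) {
            handler.visit(url);
        }
    }
}

class CompositionSketch {
    public static void main(String[] args) {
        ToyController controller = new ToyController();
        controller.addSeed("http://example.com/");
        CrawlReport report = new CrawlReport();
        controller.start(report);
        System.out.println(report.lines);
    }
}
```

Because the handler is an ordinary object, it can be created by a dependency injection container, or by hand with whatever constructor arguments it needs, before it is passed to `start()`.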

Original comment by alexande...@gmail.com on 5 Sep 2012 at 3:16

GoogleCodeExporter commented 9 years ago
Does anyone have a sample showing how to use Spring with this patch?

Original comment by gumat...@gmail.com on 1 May 2013 at 12:37

GoogleCodeExporter commented 9 years ago
For anyone wondering...

In spring-config.xml:

...
<context:annotation-config />
<context:component-scan base-package="com.my.project" />
<bean id="com.my.project.MyObject" class="com.my.project.MyObject" />
<bean id="com.my.project.MySpider" class="com.my.project.MySpider" />
<bean id="com.my.project.MyCrawlerFactory" 
class="com.my.project.MyCrawlerFactory" />
...

MyCrawlerFactory.java:

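import org.springframework.beans.factory.annotation.Autowired;

import edu.uci.ics.crawler4j.crawler.WebCrawler;

// Class name matches the file name and the bean declared in the XML above.
// IOTSONYCrawlerFactory is the factory interface, named as in the original post.
public class MyCrawlerFactory implements IOTSONYCrawlerFactory {

    @Autowired
    private MySpider mySpider;

    @Override
    @SuppressWarnings("unchecked")
    public <T extends WebCrawler> T createCrawlerInstance() {
        // Returns the single Spring-managed spider on every call.
        return (T) mySpider;
    }

}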

MySpider.java:

public class MySpider extends WebCrawler {

    @Autowired
    private MyObject myObject;

    ...

}

MyController.java:

public class MyController {

    // Classpath location of the Spring XML configuration shown above.
    private static final String SPRING_CONFIG_FILE = "com/my/project/spring.xml";
    private static final String SPRING_BEAN_MYCRAWLERFACTORY = "com.my.project.MyCrawlerFactory";

    public static void main(String[] args) throws Exception {
        ApplicationContext oContext = new ClassPathXmlApplicationContext(SPRING_CONFIG_FILE);
        // The variable type matches the cast; the original assignment would not compile.
        IOTSONYCrawlerFactory myCrawlerFactory =
                (IOTSONYCrawlerFactory) oContext.getBean(SPRING_BEAN_MYCRAWLERFACTORY);

        ...
        // all that crawler4j setup + your business logic
        ...

        controller.start(myCrawlerFactory, numberOfCrawlers);
    }

}

--

I think that's it, hopefully I didn't forget anything :)

Original comment by gumat...@gmail.com on 1 May 2013 at 4:30
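
One thing worth checking with the setup above: Spring beans are singletons by default, so a factory that returns an autowired field hands back the same object on every call. If the controller asks the factory for one crawler per thread (as the `class.newInstance()` call it replaces would), all threads would then share a single spider. The sketch below illustrates the difference with plain Java; `SpiderStub`, `SharedInstanceFactory` and `FreshInstanceFactory` are hypothetical stand-ins, not crawler4j or Spring classes.

```java
import java.util.function.Supplier;

// Stand-in for a WebCrawler subclass that carries per-thread state (illustrative only).
class SpiderStub {
    int pagesVisited;
}

// Mirrors a factory that returns an injected singleton: every call yields the same object.
class SharedInstanceFactory {
    private final SpiderStub spider = new SpiderStub();

    SpiderStub createCrawlerInstance() {
        return spider;
    }
}

// Asks a provider for a new object on each call, so every crawler thread gets its own.
class FreshInstanceFactory {
    private final Supplier<SpiderStub> provider;

    FreshInstanceFactory(Supplier<SpiderStub> provider) {
        this.provider = provider;
    }

    SpiderStub createCrawlerInstance() {
        return provider.get();
    }
}

class ScopeSketch {
    public static void main(String[] args) {
        SharedInstanceFactory shared = new SharedInstanceFactory();
        FreshInstanceFactory fresh = new FreshInstanceFactory(SpiderStub::new);

        // prints "true": both calls return the same object
        System.out.println(shared.createCrawlerInstance() == shared.createCrawlerInstance());
        // prints "false": each call returns a new object
        System.out.println(fresh.createCrawlerInstance() == fresh.createCrawlerInstance());
    }
}
```

In Spring, one way to get the second behaviour would be to declare the spider bean with prototype scope and inject an `ObjectFactory<MySpider>` or `javax.inject.Provider<MySpider>` into the factory in place of the spider itself. Whether that is needed depends on how the patched controller calls the factory.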