Closed fabriziofortino closed 9 years ago
Hi,
I'd say adding comparators and cleaners as separate parts would be good too. They are used by duke's core but could evolve separately.
Cheers, Yann
On 8 Apr 2014 at 18:40, "Fabrizio Fortino" notifications@github.com wrote:
This is a proposal to change the overall project structure by creating separate modules (the change was initially discussed in #147, https://github.com/larsga/Duke/issues/147). The project structure would look something like this:
- / (project root, contains the parent pom file)
- /duke-core (core module, contains the core logic)
- /duke-lucene (depends on core, contains the Lucene database implementation)
- /duke-mapdb (depends on core, contains the MapDB database implementation)
- /duke-es (depends on core, contains the Elasticsearch database implementation). This is almost ready; I am working on it.
- /duke-server (depends on core, contains the server implementation)
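The layout above could be expressed as a Maven multi-module build. The following is only a sketch of what the parent pom might look like: the `groupId`, the `duke-parent` artifact name, and the version are assumptions for illustration, not values taken from the actual project.

```xml
<!-- Parent pom.xml sketch. groupId, artifactId and version are assumptions. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <groupId>no.priv.garshol.duke</groupId>
  <artifactId>duke-parent</artifactId>
  <version>0.0.0-SNAPSHOT</version>
  <!-- "pom" packaging marks this as an aggregator/parent, not a jar -->
  <packaging>pom</packaging>

  <!-- One entry per sub-directory listed in the proposal -->
  <modules>
    <module>duke-core</module>
    <module>duke-lucene</module>
    <module>duke-mapdb</module>
    <module>duke-es</module>
    <module>duke-server</module>
  </modules>
</project>
```

Under this sketch, each backend module (for example duke-lucene) would declare duke-core as a regular dependency in its own pom, and only that module would declare the Lucene dependency.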
New sub-modules (for example, the Python cleaner/comparator interpreter discussed in #147, https://github.com/larsga/Duke/issues/147) could then easily be added as separate modules.
This change implies that the default database will no longer be Lucene (can we make InMemoryDatabase the default?). I would not keep the Lucene dependency in the core, because other modules (e.g. duke-es) or projects using Duke would indirectly pull in a different version of Lucene underneath.
WDYT? Can we create a separate branch to work on this?
Hi Guys,
In that case I'd like to add /duke-solr (Solr as the underlying database) once I have cleaned up the code. So far I have it working in a commercial project I'm working on.
Best regards, Kamil
This sounds good. A separate /duke-solr sounds good, too, Kamil.
I think I want to keep the comparators and cleaners in the core, at least the basic ones that don't have any dependencies. This is to limit the number of artifacts that people need to deal with. Fancier comparators and cleaners that require extra modules could then be split out.
Separate modules for lucene and mapdb make sense. Having a new default db seems like the best way to go. Either InMemory or the KeyValue database would be logical choices. I guess I'll just pick one of them and we can refine the choice later. Ideally we could use KeyValue with q-gram indexing to get reasonably fast fuzzy matching. That would make for a good default choice.
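To illustrate the q-gram indexing idea mentioned above, here is a minimal sketch. It assumes a hypothetical `QGramIndex` class that is not part of Duke: each indexed value is split into overlapping substrings of length q, and a lookup returns the records that share at least a given number of q-grams with the query. Duke's actual KeyValue database may work quite differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical q-gram inverted index for illustration; not Duke's API. */
final class QGramIndex {
    private final int q;
    // Maps each q-gram to the ids of the records whose value contains it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    QGramIndex(int q) {
        if (q < 1) throw new IllegalArgumentException("q must be positive");
        this.q = q;
    }

    /** Splits a value into its distinct lower-cased q-grams. */
    Set<String> qgrams(String value) {
        String s = value.toLowerCase(Locale.ROOT);
        Set<String> grams = new HashSet<>();
        if (s.length() < q) {
            if (!s.isEmpty()) grams.add(s); // short values are indexed whole
            return grams;
        }
        for (int i = 0; i + q <= s.length(); i++) {
            grams.add(s.substring(i, i + q));
        }
        return grams;
    }

    /** Adds a record's value to the index under the given id. */
    void index(String recordId, String value) {
        for (String gram : qgrams(value)) {
            postings.computeIfAbsent(gram, k -> new HashSet<>()).add(recordId);
        }
    }

    /** Returns the sorted ids sharing at least minShared q-grams with the query. */
    List<String> candidates(String query, int minShared) {
        Map<String, Integer> shared = new HashMap<>();
        for (String gram : qgrams(query)) {
            for (String id : postings.getOrDefault(gram, Collections.<String>emptySet())) {
                shared.merge(id, 1, Integer::sum);
            }
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : shared.entrySet()) {
            if (e.getValue() >= minShared) result.add(e.getKey());
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        QGramIndex idx = new QGramIndex(2);
        idx.index("a", "Oslo");
        idx.index("b", "Berlin");
        // The misspelled query still shares the bigrams "os", "sl", "lo" with "Oslo".
        System.out.println(idx.candidates("osloo", 2)); // prints [a]
    }
}
```

The candidates returned this way would still need to be scored by the real comparators; the index only narrows down which records are worth comparing.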
As for branches I think I can just sit down and do this in one fell swoop.
@larsga Do you have any update / ETA on this? Thanks
Any updates about duke-solr?
Hi,
I'm looking for an update on this Duke "module split" issue. Has an effort been started somewhere? Could we create a branch to start refactoring the code?
Regards,
Sorry about the silence on this one. Just way too much going on. I plan to do this over Christmas.
Hi everyone!
I've started working on a modular structure for Duke. You can find my work in progress in the dictanova/duke repository.
Regards,