datacleaner / DataCleaner

The premier open source Data Quality solution
GNU Lesser General Public License v3.0

Remote transformers - Hide / publish transformers, configuration of default properties. #659

Closed khouzvicka closed 7 years ago

khouzvicka commented 9 years ago

I want to add settings for remote transformers to the configuration file. An admin can specify a list of published (included) remote transformers by name, in which case all other transformers are hidden. Alternatively, the admin can specify a list of hidden (excluded) transformers, in which case all others are published. Only one of the two elements, includes or excludes, may be present. The second part of the configuration holds default property values for each remote transformer.

New part of the configuration file:

<published-components>
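    <!-- Only one of <includes> or <excludes> may be present; both are shown here for illustration. -->
    <includes>
        <include name="Address Correction"/>
    </includes>

    <excludes>
        <exclude name="Address Correction"/>
    </excludes>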

    <defaults>
        <component name="Address Correction">
            <properties>
                <property name="Connection">ilona://localhost:4711/TlNMb2NhdG9y/</property>
            </properties>
        </component>
    </defaults>
</published-components>
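The include/exclude rule described above could be evaluated roughly as follows. This is only a minimal sketch: the class name `PublishedComponentsFilter` and its methods are hypothetical and not part of DataCleaner, and it assumes that an empty configuration (neither list given) publishes every component.

```java
import java.util.Set;

/** Hypothetical filter mirroring the proposed includes/excludes rule. */
final class PublishedComponentsFilter {
    private final Set<String> includes;
    private final Set<String> excludes;

    PublishedComponentsFilter(Set<String> includes, Set<String> excludes) {
        // The proposal allows only one of the two lists to be present.
        if (!includes.isEmpty() && !excludes.isEmpty()) {
            throw new IllegalArgumentException("Specify either includes or excludes, not both");
        }
        this.includes = Set.copyOf(includes);
        this.excludes = Set.copyOf(excludes);
    }

    /** Returns true if the named remote transformer should be visible. */
    boolean isPublished(String componentName) {
        if (!includes.isEmpty()) {
            // Include mode: only listed components are published.
            return includes.contains(componentName);
        }
        // Exclude mode (or no lists at all, by assumption): everything not listed is published.
        return !excludes.contains(componentName);
    }
}
```

With an include list containing only "Address Correction", `isPublished` would accept that name and reject every other; with the same name in an exclude list, the result is inverted.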
kaspersorensen commented 9 years ago

Suggestion from my side on the design:

khouzvicka commented 9 years ago

Hi Kasper. I am implementing this feature and I have a question. I want to add the configuration to conf.xml. Each tenant has its own configuration (conf.xml). Is there a global configuration file for this function? Or can this be tenant-dependent?

kaspersorensen commented 9 years ago

Hi Karel,

I don't think this is good to keep in conf.xml actually, because it's quite specific to the monitor configuration whereas conf.xml is used in so many places where this stuff would not make sense.

To begin with, I would suggest adding this to our Spring XML files. In the community/open codebase we would just have a dummy bean. In the commercial edition we have ways to override this default/dummy bean with something smarter, and to provide input to it via datacleaner-monitor.properties. That would be my preferred way of configuring it.
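To illustrate what feeding such a bean from a properties file might involve, here is a minimal sketch that reads a comma-separated list of component names from a `java.util.Properties` object. The class name and the property key `remote.components.includes` are assumptions made up for this example; the thread does not specify any actual keys in datacleaner-monitor.properties.

```java
import java.util.LinkedHashSet;
import java.util.Properties;
import java.util.Set;

/** Hypothetical helper; not an existing DataCleaner class. */
final class PublishedComponentsProperties {
    // Assumed key name, chosen only for illustration.
    static final String INCLUDES_KEY = "remote.components.includes";

    /** Parses a comma-separated property value into an ordered set of trimmed names. */
    static Set<String> parseNames(Properties props, String key) {
        String raw = props.getProperty(key, "");
        Set<String> names = new LinkedHashSet<>();
        for (String part : raw.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name); // skip blanks left by a missing key or stray commas
            }
        }
        return names;
    }
}
```

A flat key/value format handles a list of names reasonably well, but nested data such as per-component default properties is more awkward to express this way, which is one possible reason to prefer an XML-based format.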

jakubneubauer commented 9 years ago

Hi Kasper, just to be sure we understand each other: injecting the "simple" implementation in the community edition and a "smarter" implementation in the professional edition is clear. The question is how to actually configure this smarter implementation. We are talking about the functionality we originally proposed as an XML snippet (see the first comment on this issue). Do you really want to configure this inside the application context XML? Is that a natural way for an application admin?

jakubneubauer commented 9 years ago

Oh, now I see your proposal of configuring it via .properties. Really?

kaspersorensen commented 9 years ago

Yes that was my thought ... But maybe too hard to configure it that way? If so we should maybe consider an optional spring beans xml file in the ${user.home}/.datacleaner directory. Let's consider that in the commercial repo though :)

jakubneubauer commented 9 years ago

So then I would recommend an extension to the Spring XML format ( http://docs.spring.io/spring/docs/4.1.x/spring-framework-reference/html/extensible-xml.html ) using the format we already proposed.

kaspersorensen commented 9 years ago

OK, that sounds fine to me. One point from my side: it would be good if this file could also provide other beans. For instance, right now we have a rather clumsy configuration step for customers who want to reconfigure their repository implementation. This could eventually be a way to add other beans as well.

khouzvicka commented 9 years ago

The configuration is implemented in a new XML context file, remote-config-context.xml.

jakubneubauer commented 9 years ago

I think it is ready now. In the community edition there is only support for component configuration through an interface, with a simple no-config implementation that allows all components in REST. The commercial edition implements it with more advanced features. (The downside of this solution is that the community edition has no way to specify default property values for components published via the REST API.)
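The split described here, an interface plus a no-config implementation that allows everything, might look something like the sketch below. The names `ComponentPublishingPolicy` and `AllowAllPublishingPolicy` are hypothetical and chosen for illustration; they are not taken from the DataCleaner codebase.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical contract that a smarter implementation could also fulfil. */
interface ComponentPublishingPolicy {
    /** Whether the named component may be exposed over REST. */
    boolean isAllowed(String componentName);

    /** Default property values for the named component, keyed by property name. */
    Map<String, String> getDefaultProperties(String componentName);
}

/** No-config implementation: every component is allowed and none has defaults. */
final class AllowAllPublishingPolicy implements ComponentPublishingPolicy {
    @Override
    public boolean isAllowed(String componentName) {
        return true;
    }

    @Override
    public Map<String, String> getDefaultProperties(String componentName) {
        return Collections.emptyMap();
    }
}
```

Because callers would depend only on the interface, a different implementation could be injected as a Spring bean without changing the code that consults the policy.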

kaspersorensen commented 7 years ago

No longer relevant since DataCloud is not in community edition.