flipkart-incubator / databuilderframework

A data driven execution engine
34 stars 29 forks source link

Proper way to access dataSet from DataBuilderContext #39

Open chaitanyachavali opened 4 years ago

chaitanyachavali commented 4 years ago

As of now there are two methods by which we can get dataset from DataBuilderContext

While implementing DataBuilder#process(DataBuilderContext context) we would need to get dataset and isn't using context.getDataSet() is the correct way to access? (If yes then why is it marked as deprecated?) I don't see a clean way to use getDataSet(DataBuilder builder) inside process, if I do getDataSet(this) there will a NPE as there won't be any dataBuilderMeta set with current instance

getDataSet(DataBuilder builder) enforces to use only data classes mentioned in consumes, optional and access anyways this enforcement is already happening from executors when calling process. Ref:

Either we have to set dataBuilderMeta when processed in withDataBuilder here something like dataBuilder.setDataBuilderMeta(dataBuilderMeta) which set's it to dataBuilder instance and can be accessed when processing or make other non-deprecated or I should be missing something đŸ˜…

r0goyal commented 4 years ago

@Chaitanyachavali You can use DataBuilderFactory for setting up relevant metadata for a builder instance automatically.

There are two inbuilt implementations of DataBuilderFactory -

Sample code for creating an executor

DataBuilderMetadataManager dataBuilderMetadataManager = new DataBuilderMetadataManager();
/*
Write code to register all your builder classes here by invoking dataBuilderMetadataManager.register()
/*

DataBuilderFactory factory = new InstantiatingDataBuilderFactory(dataBuilderMetadataManager);
DataFlowExecutor executor = new DataFlowExecutor(factory);

Will check on getDataSet() and getDataSet(DataBuilder builder) and get back.

chaitanyachavali commented 4 years ago

@r0goyal Thanks for the quick reply!

I am not sure if I am following you correctly, please bear me here :) > You can use DataBuilderFactory for setting up relevant metadata for a builder instance automatically Yes, by factory.create(DataBuilderMeta dataBuilderMeta) we can get a builder instance with metadata but how would I build a DataBuilderMeta while prepping a DataFlow? If I do it by Utils.meta wouldn't that be an overhead as the same would be done again while building dataflow.

From sample code what I understood is that I can create my own manager and register all the databuilders and pass on via factory to the executor and executor can get a databuilder with the metadata from the factory instantiated DataBuilder. This is great, I see this is much relevant to engine side of it and will help very much in cases like when I wanted to override the annotation processed metadata and so on, I am more interested with databuilder.process and the dataflow building.

Sorry I got it wrong previously, went through factory usecases in dataflow. Since default is MixedDataBuilderFactory for dataflow and the instance passed is dataflow doesn't have metadata getDataSet(this) was having issues, with InstantiatingDataBuilderFactory this would be fixed as it builds a new builder instance setting metadata. Also would love to know your comments on getDataSet functionality as this is very common use case and also registering all databuilders would be a heavy :P

Note: If dataflow has dataBuilderFactory and we pass a custom factory to executor, dataBuilderFactory from dataflow is given priority. In DataFlowBuilder by default factory is set to MixedDataBuilderFactory, so if dataflow is built using dataFlowBuilder then we would need to set factory explicitly after instance being built. Would also be great if we can have withDataBuilderFactory in DataFlowBuilder. Thoughts?

chaitanyachavali commented 4 years ago

@r0goyal Any update here?