Closed yidong72 closed 4 years ago
Please update the changelog in order to start CI tests.
View the gpuCI docs here.
Checked the unit tests and flake8 checks. All the notebooks work fine except the customized nodes one, where Dask gives an error.
The remaining two notebooks are fixed. Ready for review.
Added a unit test to make sure the nodes compute consistent results.
All the updates look good.
I found one bug that was not caused by any of these changes, but I'd like to get it fixed. The bug is in how inputs are set up for nodes without ports (this is my fault; I introduced it when adding the ports API). The order of inputs can be incorrect for non-port nodes that have multiple inputs. I discovered the bug while re-running the mortgage example.
File: "<>/gquant/dataframe_flow/_node_flow.py", method __call__, lines 696-697:
    inputs = [self.__make_copy(data_input) for data_input in inputs_data.values()]
The above code is wrong but my initial solution was incorrect too.
There's a bug in my fix. I'm working on figuring it out. Sorry.
I'm testing again. The fix is that the inputs should be set up only when self.load is not set. Change to the code below:
    def __call__(self, inputs_data):
        if self.load:
            if isinstance(self.load, bool):
                output_df = self.load_cache()
            else:
                output_df = self.load
        else:
            if self._using_ports():
                # nodes with ports take dictionary as inputs
                inputs = {iport: self.__make_copy(data_input)
                          for iport, data_input in inputs_data.items()}
            else:
                # nodes without ports take list as inputs
                inputs = [self.__make_copy(inputs_data[ient['to_port']])
                          for ient in self.inputs]
            # . . . the rest of the code
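For anyone following along, here is a minimal standalone sketch of why building the list from inputs_data.values() can put inputs in the wrong order, and why indexing by each entry's 'to_port' avoids it. This is not gQuant code: declared_inputs stands in for self.inputs, the string values stand in for dataframes, and the key names 'left' and 'right' are made up for illustration. It only assumes that each declared input carries a 'to_port' key, as in the snippet above.

```python
# Hypothetical stand-in for self.inputs on a node without ports:
# the declared order is 'left' first, then 'right'.
declared_inputs = [{'to_port': 'left'}, {'to_port': 'right'}]

# Hypothetical stand-in for inputs_data. Upstream results may be inserted
# in a different order than the declared one, e.g. 'right' arrives first.
inputs_data = {'right': 'right_df', 'left': 'left_df'}

# Old approach: the list follows dict insertion order, not the declared order.
by_values = [data for data in inputs_data.values()]

# Fixed approach: look up each input by the key recorded in the declaration.
by_declared = [inputs_data[ient['to_port']] for ient in declared_inputs]

print(by_values)    # ['right_df', 'left_df']
print(by_declared)  # ['left_df', 'right_df']
```

With a single input the two approaches agree, which is probably why this only showed up on a multi-input example like the mortgage one.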
Made the change.
Improved the CSV file loading time from 89s to 3s. Fixed a few API behavior changes.