johnandersen777 closed this issue 3 years ago
Hi @pdxjohnny
According to our input validation check for linear regression, the number of rows in each batch must be greater than or equal to the number of columns + (int)(parameter->interceptFlag == true). In your case, training with two rows per batch should work.
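As a rough illustration of the rule above, here is a minimal sketch in plain Python. The helper names `min_rows_per_batch` and `batch_is_valid` are hypothetical and not part of daal4py or oneDAL, and the sketch assumes "columns" means the feature columns passed to compute.

```python
def min_rows_per_batch(n_columns: int, intercept: bool) -> int:
    """Smallest batch size allowed by the quoted rule (hypothetical helper)."""
    # Mirrors: number of columns + (int)(interceptFlag == true)
    return n_columns + int(intercept)


def batch_is_valid(n_rows: int, n_columns: int, intercept: bool = True) -> bool:
    """Return True if a batch with n_rows rows satisfies the quoted rule."""
    return n_rows >= min_rows_per_batch(n_columns, intercept)


# One feature column ("x") with interceptFlag=True needs at least 2 rows.
print(min_rows_per_batch(1, True))   # 2
print(batch_is_valid(1, 1, True))    # False
print(batch_is_valid(2, 1, True))    # True
```

Under this reading, a batch size of 1 only passes when there is a single feature column and interceptFlag is False.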
This code works in my case for daal4py 2020.1
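import daal4py, pandas
import numpy as np

lm = daal4py.linear_regression_training(
    interceptFlag=True, streaming=True
)
for x, y in [
    ([0.0, 0.0], [0, 0]),
    ([0.1, 0.1], [0, 0]),
]:
    # Each batch holds two rows: feature column "x" and target column "y"
    feature_data = {"x": x, "y": y}
    print(feature_data)
    print()
    df = pandas.DataFrame(feature_data, index=[0, 1])
    print(df)
    print()
    # Keyword form of drop; the positional axis argument (1) is rejected by newer pandas
    xdata = df.drop(columns=["y"])
    ydata = df["y"]
    print("xdata", type(xdata), repr(xdata))
    print()
    print("ydata", type(ydata), repr(ydata))
    print()
    print()
    # Feed this batch to the streaming trainer
    lm.compute(xdata, ydata)
# Combine the partial results from all batches
lm.finalize()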
What happens if our dataset has an odd number of records? Do we have to feed it through twice? Can we have the old behavior back, please? Or is there some reason why we can't or shouldn't have a batch size of 1?
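One possible workaround for a record count that does not divide evenly, sketched in plain Python: fold a short trailing batch into the previous one so that no batch drops below the minimum. The function `make_batches` is a hypothetical helper, not part of daal4py, and `min_rows` is assumed to be the minimum batch size required by the validation check.

```python
def make_batches(records, min_rows):
    """Split records into batches of min_rows rows.

    A trailing batch shorter than min_rows is merged into the
    previous batch instead of being emitted on its own.
    """
    if len(records) < min_rows:
        raise ValueError("not enough records for a single valid batch")
    batches = [
        records[i:i + min_rows] for i in range(0, len(records), min_rows)
    ]
    # Only the last batch can be short; merge it into its predecessor.
    if len(batches[-1]) < min_rows:
        tail = batches.pop()
        batches[-1] = batches[-1] + tail
    return batches


# Five records with a minimum batch size of two.
print(make_batches([0, 1, 2, 3, 4], 2))  # [[0, 1], [2, 3, 4]]
```

With this approach each record is still passed to compute exactly once, so nothing has to be fed through twice.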
Update on this. I think the issue is really with the daal conda package version 2020.2, rather than daal4py
https://github.com/intel/dffml/commit/c239118c4027a360043a1d9e15fd12633bcf095e
@pdxjohnny Thanks so much for pointing out this issue. I created a pull request with a fix: https://github.com/oneapi-src/oneDAL/pull/764
Hi @pdxjohnny, the PR was merged successfully. All changes will be available in oneDAL 2021.2.
@pdxjohnny Is there still a problem with the latest version of daal4py?
If new problems come up, please reopen the issue.
The updated daal 2020.2 release a few days ago resulted in pandas objects no longer being accepted by the compute method. The following code worked on 2020.1; this is a simplified example taken from: https://github.com/intel/dffml/blob/63b490a5b6402dcb770072f75b3b665e433525f3/model/daal4py/dffml_model_daal4py/daal4pylr.py#L44-L78
CI logs: https://github.com/intel/dffml/runs/896438744?check_suite_focus=true
Output (this had been run within one of our TestCases, I've pulled it out of that method into the above code):