Open amotl opened 1 year ago
Disclaimer: I was just reading the traceback and quickly poked a bit at the code of MindsDB, nothing else yet.
Initially, in 2021, support for CrateDB was added in the form of a datasource module ^3 with https://github.com/mindsdb/datasources/pull/58.
Then, in 2022, another module supporting CrateDB was added with https://github.com/mindsdb/mindsdb/pull/2682, which is described and documented as the "CrateDB Integration" ^1 and implemented as a handler module ^handler-cratedb.
Now, looking at the traceback, I can't see that this handler module is actually used. What is striking is that the lightwood_handler ^handler-lightwood is used instead.
Traceback (most recent call last):
File "/path/to/mindsdb/lib/python3.8/site-packages/mindsdb/integrations/libs/ml_exec_base.py", line 128, in learn_process
ml_handler.create(target, df=training_data_df, args=problem_definition)
File "/path/to/mindsdb/lib/python3.8/site-packages/mindsdb/integrations/handlers/lightwood_handler/lightwood_handler/lightwood_handler.py", line 67, in create
run_learn(
Other than this, I am not sure whether the type inference subsystem is being misled here, and whether that actually causes the error.
INFO:type_infer-6520:Infering type for: saledate
INFO:type_infer-6520:Column saledate has data type categorical
INFO:type_infer-6520:Infering type for: ma
INFO:type_infer-6520:Column ma has data type binary
INFO:type_infer-6520:Infering type for: type
INFO:type_infer-6520:Column type has data type binary
INFO:type_infer-6520:Infering type for: bedrooms
INFO:type_infer-6520:Column bedrooms has data type binary
WARNING:type_infer-6520:Column saledate is an identifier of type "UUID"
WARNING:type_infer-6520:Column bedrooms is an identifier of type "No Information"
I've raised it as an issue in the MindsDB repository.
Based on earlier feedback from the MindsDB team, the KeyError: 'saledate' is thrown when the data set is too small. After increasing the size of the data set, the error changes, as posted in the linked issue.
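To make the "data set too small" observation easier to check before training, here is a minimal sketch using pandas. It only counts rows and distinct values per column; it does not reproduce MindsDB's or type_infer's actual logic. The column names are taken from the log output above, while the sample values and the helper name `summarize_columns` are made up for illustration.

```python
import pandas as pd


def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return the pandas dtype and the number of distinct values per column."""
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "distinct": df.nunique(),
        }
    )


# Hypothetical sample rows; only the column names come from the log above.
sample = pd.DataFrame(
    {
        "saledate": ["2007-03-31", "2007-06-30", "2007-09-30", "2007-12-31"],
        "ma": [441854, 441854, 450000, 450000],
        "type": ["house", "unit", "house", "unit"],
        "bedrooms": [2, 3, 2, 3],
    }
)

# Parse the date column explicitly instead of leaving it as plain strings.
sample["saledate"] = pd.to_datetime(sample["saledate"])

summary = summarize_columns(sample)
print(f"{len(sample)} rows")
print(summary)
```

With so few rows, several columns have only two distinct values. That would be consistent with the "binary" classifications in the log, although whether it is the actual cause remains an assumption.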
Hi. The upstream issue has been closed as completed, but without any further information. Shall we re-evaluate the situation?
Hi,
I checked the commit history since the last time I looked at it, and I don't see any changes. The closing of the upstream issue probably has to be interpreted as "won't fix", since there are also no linked pull requests or anything else.
Thanks. Can you re-open the issue, and can we discuss it? I would like to use your insights to eventually submit a patch, if applicable. However, I haven't approached the topic yet, just tried to contribute my share by tracking it.
A concise minimal reproducer could help to get closer to the issue. The tutorials referenced above start a bit too high-level for me, apparently expecting a completed setup already.
Dear @chandrevdw31,
thank you for submitting GH-118. Does that mean MindsDB works well together with CrateDB now?
As you can read from this discussion, we could not derive any clear outcome from https://github.com/mindsdb/mindsdb/issues/5483 ff., and have not re-evaluated the situation on our end yet.
With kind regards, Andreas.
Hi @amotl
Thank you for raising this. I will investigate this further with the team and have this matter resolved.
Kind regards
Thank you very much, Chandre. Do you have any news to report about this matter?
/cc @hlcianfagna
Hi there,
@hammerhead recently evaluated the "Forecasting Quarterly House Sales with MindsDB" tutorial with CrateDB and this Python driver (thanks!), so I would like to report on the outcome.
With kind regards, Andreas.
Report