Open datajoely opened 5 months ago
It is unclear why those logs don't show tracebacks.
Anyway, the current implementation of AbstractDataset is responsible for that DatasetError:
They must be in the exc object somewhere; I refuse to believe otherwise.
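To illustrate where the original traceback would live, here is a minimal sketch. It assumes the wrapper chains the original error with `raise ... from exc`. `DatasetError` below is a local stand-in and is not imported from Kedro, and `load_with_wrapper`, `root_cause` and `broken_loader` are hypothetical names made up for this example.

```python
import traceback


class DatasetError(Exception):
    """Local stand-in for kedro.io.core.DatasetError (not imported from Kedro)."""


def load_with_wrapper(load_func):
    """Hypothetical wrapper: re-raise any failure as DatasetError, chained with `from`."""
    try:
        return load_func()
    except Exception as exc:
        raise DatasetError(f"Failed while loading data from dataset.\n{exc}") from exc


def root_cause(exc: BaseException) -> BaseException:
    """Follow __cause__ (explicit chaining) or __context__ (implicit) to the first error."""
    seen = set()
    while id(exc) not in seen:
        seen.add(id(exc))
        nxt = exc.__cause__
        if nxt is None and not exc.__suppress_context__:
            nxt = exc.__context__
        if nxt is None:
            break
        exc = nxt
    return exc


def broken_loader():
    # Stand-in for a failing read; the message mirrors the one quoted in this issue.
    raise ValueError("No columns to parse from file")


try:
    load_with_wrapper(broken_loader)
except DatasetError as err:
    cause = root_cause(err)
    # With chaining enabled (the default), the report includes the inner traceback too.
    full_report = "".join(traceback.format_exception(type(err), err, err.__traceback__))
    print(type(cause).__name__, cause)
    print(full_report)
```

If the chain is intact, whatever renders the log only needs to format the whole exception rather than just `str(err)`.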
Thank you @datajoely! Could you please provide some more context on which AWS service was used to run the Kedro pipeline? We would like to check whether the service is filtering the error messages, since as far as we can tell we always show the entire error log.
I've asked the user to comment here to double check, but I think it was:
Docker image running on AWS ECS
First, I amend my comment above: the traceback is there (File /usr/local/...).
The problem of AbstractDataset hiding the real error has been mentioned in other places (https://github.com/kedro-org/kedro/issues/1936#issuecomment-1727172650, https://github.com/kedro-org/kedro/issues/2199#issuecomment-2101008300) although I don't think we have an issue for it (@ElenaKhaustova?). If that's the case, maybe we can keep this issue open?
In #2943 we partly addressed the issue of unclear errors with datasets. Yet we have a bit more evidence about this still being a problem.
The user was getting
Class 'projx.models.audio.io.LargeModel' not found, is this a typo?
but the actual underlying error was:
>>> from projx.models.audio.io import LargeModel
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/app/src/projx/models/audio/__init__.py", line 1, in <module>
    from .base import LAM
  File "/app/src/projx/models/audio/base.py", line 1, in <module>
    from elevenlabs.client import ElevenLabs
ModuleNotFoundError: No module named 'elevenlabs'
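One way to tell a genuine typo apart from a missing dependency is to look at the `name` attribute of the `ModuleNotFoundError`. The sketch below is only an illustration of that idea and is not Kedro's loading code; `load_class` is a hypothetical helper and the error messages are made up for this example.

```python
import importlib


def load_class(dotted_path: str):
    """Hypothetical helper: import `pkg.module.ClassName` and report *why* it failed."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"Expected a dotted path, got '{dotted_path}'")
    try:
        module = importlib.import_module(module_path)
    except ModuleNotFoundError as exc:
        # exc.name is the module that could not be found. If it is the requested
        # module (or one of its parents), the path itself is probably wrong.
        missing = exc.name or ""
        if missing and (module_path == missing or module_path.startswith(missing + ".")):
            raise ImportError(
                f"Module '{module_path}' not found, is this a typo?"
            ) from exc
        # Otherwise the module exists but one of its own imports failed.
        raise ImportError(
            f"Importing '{module_path}' failed: dependency '{missing}' is not installed"
        ) from exc
    try:
        return getattr(module, class_name)
    except AttributeError as exc:
        raise ImportError(
            f"Class '{class_name}' not found in module '{module_path}', is this a typo?"
        ) from exc
```

With a check like this, the case above would report the missing `elevenlabs` dependency instead of suggesting a typo, and the `from exc` chaining keeps the original traceback available.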
Another internal user reported this today.
Description
A user reported that Kedro was unable to read the CSV. They get the following logs in AWS:
The "No columns to parse from file" error is thrown by the underlying pandas implementation in this file.
It would be helpful if Kedro could bubble up that the error is thrown in pandas.io.parsers.python_parser so that it is clear where the issue lies. The error above mentions kedro.io.core.DatasetError; is it not possible to do the same for the module that raised the original error?
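As a rough sketch of what "bubbling up the origin" could look like, the module that raised an exception can be read from the last frame of its traceback. Everything here is an assumption for illustration: `DatasetError` is a local stand-in (not imported from Kedro), `raising_module`, `describe` and `load` are hypothetical helpers, and `json` from the standard library stands in for pandas so the example has no third-party dependencies.

```python
import json


class DatasetError(Exception):
    """Local stand-in for kedro.io.core.DatasetError (not imported from Kedro)."""


def raising_module(exc: BaseException) -> str:
    """Return the __name__ of the module whose code raised `exc`."""
    tb = exc.__traceback__
    if tb is None:
        return "<unknown>"
    # The last traceback entry is the frame where the exception was raised.
    while tb.tb_next is not None:
        tb = tb.tb_next
    return tb.tb_frame.f_globals.get("__name__", "<unknown>")


def describe(exc: BaseException) -> list:
    """One summary line per link in the explicit cause chain, outermost first."""
    lines = []
    current = exc
    while current is not None:
        cls = type(current)
        lines.append(f"{cls.__module__}.{cls.__qualname__} raised in {raising_module(current)}")
        current = current.__cause__
    return lines


def load(text: str):
    """Hypothetical loader that names the originating module in its message."""
    try:
        return json.loads(text)
    except Exception as exc:
        origin = raising_module(exc)
        raise DatasetError(f"Failed while loading data (raised in {origin}): {exc}") from exc


try:
    load("")  # an empty document makes the parser fail
except DatasetError as err:
    message = str(err)
    chain = describe(err)
    print(message)
    print("\n".join(chain))
```

The same lookup applied to the pandas error would name the parser module in the wrapped message, which is what this request is asking for.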