Open touma-I opened 4 days ago
If each transform is built as its own module, should we add __init__.py files and create a unified namespace?
For example, instead of
from dpk_html2parquet.transform import Html2ParquetTransformConfiguration
use
from dpk_html2parquet import Html2ParquetTransformConfiguration
The same applies to the ray runtime.
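A minimal sketch of how such a re-export could work, assuming the package's __init__.py simply imports the class from its transform submodule. The package name dpk_demo and the class DemoTransformConfiguration are hypothetical stand-ins built in a temporary directory; they are not the actual dpk_html2parquet code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk. "dpk_demo" and
# "DemoTransformConfiguration" are hypothetical names for illustration.
pkg_root = Path(tempfile.mkdtemp())
pkg_dir = pkg_root / "dpk_demo"
pkg_dir.mkdir()

# The submodule that actually defines the class.
(pkg_dir / "transform.py").write_text(
    "class DemoTransformConfiguration:\n"
    "    name = 'demo'\n"
)

# __init__.py re-exports the class so callers can skip ".transform".
(pkg_dir / "__init__.py").write_text(
    "from .transform import DemoTransformConfiguration\n"
    "__all__ = ['DemoTransformConfiguration']\n"
)

# Make the temporary package importable.
sys.path.insert(0, str(pkg_root))
importlib.invalidate_caches()

from dpk_demo import DemoTransformConfiguration as short_path
from dpk_demo.transform import DemoTransformConfiguration as long_path

# Both import paths resolve to the same class object.
same_object = short_path is long_path
```

With this arrangement the longer import path keeps working, so existing callers would not break while new code uses the shorter one.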
Since we are doing such a massive refactoring, should we combine test and test-data into one dir, e.g.
test
    data
        input
        expected
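For illustration, a test could locate its fixtures under the nested layout roughly as follows. This is only a sketch under the assumption that input and expected both sit under test/data; the helper name data_dirs is hypothetical, and the layout is created in a temporary directory so the snippet runs on its own.

```python
import tempfile
from pathlib import Path


def data_dirs(test_dir: Path) -> tuple:
    """Return the (input, expected) directories under <test_dir>/data."""
    data = test_dir / "data"
    return data / "input", data / "expected"


# Recreate the proposed layout in a temporary directory.
# In a real test module, test_dir would typically be Path(__file__).parent.
root = Path(tempfile.mkdtemp())
test_dir = root / "test"
for sub in ("input", "expected"):
    (test_dir / "data" / sub).mkdir(parents=True)

input_dir, expected_dir = data_dirs(test_dir)
```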
@roytman Why not leave it to the transform owner or developer to decide whether to nest test-data under test? All we care about is that there is a test folder for running pytest, no? Where the developer puts their data is up to them, no?
If each transform is built as its own module, should we add __init__.py files and create a unified namespace? For example, instead of
from dpk_html2parquet.transform import Html2ParquetTransformConfiguration
use
from dpk_html2parquet import Html2ParquetTransformConfiguration
The same applies to the ray runtime.
How did I miss that? Done. Thanks @roytman
Why are these changes needed?
This is the first in a series of restructuring changes so that each transform is built as its own module (e.g. dpk_html2parquet) with a ray submodule (dpk_html2parquet.ray).
Related issue number (if any).
https://github.com/IBM/data-prep-kit/issues/774