Open astrojuanlu opened 9 months ago
@rxm7706 do you have any thoughts? kedro-datasets is mostly useless without the optional dependencies, but to my knowledge conda packages don't have such a thing. Any prior art on what the best practices are in cases like this?
@astrojuanlu - that is accurate. So far I have managed by installing only the dependencies needed for the plugins I use.
Option 1: We can go with independent feedstocks - e.g. kedro-datasets-plotly (see https://github.com/kedro-org/kedro-plugins/blob/main/kedro-datasets/pyproject.toml#L25C1-L25C7).
But for something like kedro-datasets, which has many variations and will continue to grow, we would end up with a lot of disconnected feedstocks and a lot of sequential feedstock updates to maintain.
OpenLineage and OpenTelemetry are examples of this pattern: https://github.com/conda-forge/openlineage-airflow-feedstock/blob/main/recipe/meta.yaml https://github.com/conda-forge/opentelemetry-instrumentation-grpc-feedstock/blob/main/recipe/meta.yaml
Option 2: I would almost rather go the way of Airflow - one recipe, many outputs. e.g. https://github.com/conda-forge/airflow-feedstock/blob/main/recipe/meta.yaml#L200
I would recommend Option 2. Feel free to raise a PR and we can leverage this single feedstock. Let me know what you prefer and how I can help.
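For illustration only, the one-recipe-many-outputs layout from Option 2 might look roughly like the sketch below. This is not taken from the actual feedstock: the top-level name `kedro-datasets-split`, the placeholder version, the `kedro-datasets-plotly` output, and the dependency lists are all assumptions. It relies on conda-build's `outputs` section and `pin_subpackage` so that each extra becomes its own package pinned to the matching core build.

```yaml
{% set version = "0.0.0" %}  # placeholder, not a real release

package:
  name: kedro-datasets-split  # hypothetical top-level name
  version: {{ version }}

# source: and about: sections omitted for brevity

build:
  number: 0

outputs:
  # Core package without any optional dependencies.
  - name: kedro-datasets
    build:
      noarch: python
    requirements:
      host:
        - python
        - pip
      run:
        - python
        - kedro  # assumed runtime dependency

  # One output per extra; the name and dependency list are illustrative.
  - name: kedro-datasets-plotly
    build:
      noarch: python
    requirements:
      host:
        - python
      run:
        # Pin the extra to the exact core build from this same recipe.
        - {{ pin_subpackage("kedro-datasets", exact=True) }}
        - plotly
```

With a layout like this, a user would install only the outputs they need, for example `conda install kedro-datasets-plotly`, and a single version bump in the recipe updates every output together.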
cc @merelcht @noklam FYI
Comment:
Just learned that there's a feedstock for this package already, thank you!