datarobot / airflow-provider-datarobot

DataRobot provider for Apache Airflow
https://pypi.org/project/airflow-provider-datarobot/

Elyra Pipelines Integration failing with wheel file #8

Open shalberd opened 1 year ago

shalberd commented 1 year ago

I am using Airflow 2.4.1. In addition to the basic Airflow operators listed at

https://airflow.apache.org/docs/apache-airflow/2.4.1/_api/airflow/operators/index.html#

I would also like to use this provider, with all its components, in Elyra.

In Elyra Pipelines, operators can be added through a feature called the "Airflow Provider Package Catalog Connector".

https://medium.com/ibm-data-ai/getting-started-with-apache-airflow-operators-in-elyra-aae882f80c4a

However, when I add the wheel file download URL to the Elyra config, I get the following error in the JupyterLab Elyra container:

[E 2023-01-06 16:41:25.198 ElyraApp] Error. Airflow provider package connector 'DataRobot Operator Components for Airflow' is not configured properly. The archive '/tmp/tmpij9e8m_j/airflow_provider_datarobot-0.0.3-py3-none-any.whl' contains 0 file(s) named 'get_provider_info.py'.
[I 2023-01-06 16:41:27.003 SingleUserLabApp log:189] 200 GET /user/kube%3Aadmin/api/kernels?1673023286993 (kube:admin@10.131.0.7) 1.77ms 
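
The check behind this message can be reproduced locally, since a wheel is an ordinary zip archive. The sketch below is only an illustration. The helper names `find_named_files` and `read_entry_points` and the demo archive contents are hypothetical, and matching on the base file name is an assumption inferred from the wording of the error, not taken from Elyra's code. It also prints the wheel's `entry_points.txt`, because an Airflow 2 provider can register its metadata through the `apache_airflow_provider` entry point without shipping a file named `get_provider_info.py`.

```python
import io
import zipfile
from pathlib import PurePosixPath


def find_named_files(archive, file_name="get_provider_info.py"):
    """Return the archive members whose base name equals file_name."""
    with zipfile.ZipFile(archive) as whl:
        return [n for n in whl.namelist() if PurePosixPath(n).name == file_name]


def read_entry_points(archive):
    """Return the text of <name>.dist-info/entry_points.txt, or None if absent."""
    with zipfile.ZipFile(archive) as whl:
        for name in whl.namelist():
            if name.endswith(".dist-info/entry_points.txt"):
                return whl.read(name).decode("utf-8")
    return None


def make_demo_wheel(members):
    """Build a small in-memory zip that stands in for a real wheel (demo only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as whl:
        for name, text in members.items():
            whl.writestr(name, text)
    buf.seek(0)
    return buf


# Hypothetical layout: provider info is a function in __init__.py,
# exposed only through the entry point, with no get_provider_info.py file.
demo = make_demo_wheel({
    "demo_provider/__init__.py": "def get_provider_info():\n    return {}\n",
    "demo_provider-0.0.1.dist-info/entry_points.txt": (
        "[apache_airflow_provider]\n"
        "provider_info=demo_provider.__init__:get_provider_info\n"
    ),
})

print(len(find_named_files(demo)))  # number of files named get_provider_info.py
print(read_entry_points(demo))
```

To inspect a downloaded wheel, pass its path instead of `demo`, for example `find_named_files("airflow_provider_datarobot-0.0.3-py3-none-any.whl")`.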

@ptitzler @kiersten-stokes

Can you maybe provide the developer with a hint as to what would need to be changed for integration to work?

It would be great to have all those operators available via Elyra Pipelines.

https://pypi.org/project/airflow-provider-datarobot/#description

shalberd commented 1 year ago

Then again, this may not yet work with any provider built for Airflow 2.x, as the DataRobot provider is. For example, the Apache Airflow Providers Amazon wheel file, which does contain get_provider_info.py in its directories, triggers a different error in the Elyra container:

[E 2023-01-06 17:10:10.314 SingleUserLabApp component_parser_airflow:56] Content associated with identifier '{'provider_package': 'apache_airflow_providers_amazon-7.0.0-py3-none-any.whl', 'provider': 'apache_airflow_providers_amazon', 'file': 'airflow/providers/amazon/aws/operators/ecs.py'}' could not be parsed: 'Attribute' object has no attribute 'id'. Skipping...

ptitzler commented 1 year ago

Hi @shalberd,

There have been many Apache Airflow changes since Elyra introduced the provider package connector in version 3.6. The changes to provider package implementations are incompatible and our connector is unable to locate the operator classes in the archive. Any provider created for Airflow releases > 2.2 will therefore likely not work.
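
The `'Attribute' object has no attribute 'id'` message quoted earlier in the thread is the error Python's `ast` module produces when code reads `.id` from an `ast.Attribute` node. One plausible trigger, offered here as an assumption and not as a description of Elyra's actual parser, is an operator class whose base is written in dotted form such as `models.BaseOperator`. The sketch below uses a made-up source string and hypothetical helper names to show the failure and a more tolerant lookup.

```python
import ast

# Made-up module source; ast.parse only reads the text, so Airflow
# does not need to be installed for this to run.
SOURCE = '''
from airflow.models import BaseOperator
import airflow.models as models

class DirectOperator(BaseOperator):
    pass

class QualifiedOperator(models.BaseOperator):
    pass
'''


def naive_base_names(source):
    """Assumes every base class is a plain name; fails on dotted bases."""
    tree = ast.parse(source)
    return [
        base.id  # ast.Attribute nodes have .attr and .value, but no .id
        for node in ast.walk(tree)
        if isinstance(node, ast.ClassDef)
        for base in node.bases
    ]


def base_name(node):
    """Return the rightmost identifier of a base class expression, if any."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return node.attr
    return None


def operator_classes(source, base="BaseOperator"):
    """List classes that directly subclass `base`, in plain or dotted form."""
    tree = ast.parse(source)
    return sorted(
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.ClassDef)
        and any(base_name(b) == base for b in node.bases)
    )


try:
    naive_base_names(SOURCE)
except AttributeError as err:
    print(err)

print(operator_classes(SOURCE))
```

This only matches direct subclasses by name. Resolving import aliases or indirect inheritance would need more than a single pass over the syntax tree.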

shalberd commented 1 year ago

@ptitzler are there any plans to change that situation in Elyra, or, in other words, how important within IBM is Airflow still at this point?

ptitzler commented 1 year ago

@shalberd I am no longer involved with the Elyra project and therefore don't have any insights into future plans. Please reach out to the remaining maintainers via the project's public channels.