datacontract / datacontract-cli

CLI to manage your datacontract.yaml files
https://cli.datacontract.com

fix: use dynamic import from importlib #292

Closed teoria closed 2 days ago

teoria commented 3 days ago

This fix solves the import issue that occurs when the user installs a specific dependency:

datacontract-cli[databricks]==0.10.8

https://github.com/datacontract/datacontract-cli/issues/286

pierre-monnet commented 3 days ago

Hi @teoria, I cloned your branch and tested it, and I still have the same issue :/

File <command-1477724087563699>, line 3
      1 import os, sys
      2 sys.path.append(os.path.abspath('/datacontract-cli-fix_dynamic_import/'))
----> 3 from datacontract.data_contract import DataContract
      5 data_contract = DataContract(spark=spark)
      6 spec = DataContract.init()
File /datacontract-cli-fix_dynamic_import/datacontract/data_contract.py:17
     15 from datacontract.export.exporter import ExportFormat
     16 from datacontract.export.exporter_factory import exporter_factory
---> 17 from datacontract.imports.importer_factory import importer_factory
     19 from datacontract.integration.publish_datamesh_manager import publish_datamesh_manager
     20 from datacontract.integration.publish_opentelemetry import publish_opentelemetry
File /datacontract-cli-fix_dynamic_import/datacontract/imports/importer_factory.py:39
     34         importer_factory.register_importer(import_format, importer_class)
     37 importer_factory = ImporterFactory()
---> 39 load_importer(
     40     import_format=ImportFormat.avro, module_path="datacontract.imports.avro_importer", class_name="AvroImporter"
     41 )
     42 load_importer(
     43     import_format=ImportFormat.bigquery,
     44     module_path="datacontract.imports.bigquery_importer",
     45     class_name="BigQueryImporter",
     46 )
     47 load_importer(
     48     import_format=ImportFormat.glue, module_path="datacontract.imports.glue_importer", class_name="GlueImporter"
     49 )
File /datacontract-cli-fix_dynamic_import/datacontract/imports/importer_factory.py:33, in load_importer(import_format, module_path, class_name)
     31 module = lazy_module_import(module_path)
     32 if module:
---> 33     importer_class = getattr(module, class_name)
     34     importer_factory.register_importer(import_format, importer_class)
File <frozen importlib.util>:247, in __getattribute__(self, attr)
File /datacontract-cli-fix_dynamic_import/datacontract/imports/avro_importer.py:1
----> 1 import avro.schema
      3 from datacontract.imports.importer import Importer
      4 from datacontract.model.data_contract_specification import DataContractSpecification, Model, Field

teoria commented 3 days ago

Hmm, I validated the module but not its dependencies. @pierre-monnet, thanks. I'll add dependency validation.
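
A minimal sketch of why validating only the module is not enough. It assumes `lazy_module_import` follows the standard `importlib.util.LazyLoader` recipe, which the `<frozen importlib.util> ... __getattribute__` frame in the traceback suggests but the thread does not show. The module `fake_importer`, the class `FakeImporter`, and the package `dependency_that_is_not_installed` are hypothetical names used only for illustration.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def lazy_module_import(module_path):
    """Assumed shape of the helper: lazy import via the standard LazyLoader recipe."""
    spec = importlib.util.find_spec(module_path)
    if spec is None:
        return None  # the module file itself is missing
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_path] = module
    loader.exec_module(module)  # defers running the module body
    return module


def demo():
    """Return (module_was_found, name of the error raised on first attribute access)."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical importer module whose third-party dependency is not installed.
        Path(tmp, "fake_importer.py").write_text(
            "import dependency_that_is_not_installed\n"
            "class FakeImporter:\n"
            "    pass\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = lazy_module_import("fake_importer")
            found = module is not None  # validating the module alone passes
            try:
                getattr(module, "FakeImporter")  # the module body runs only now
                error = None
            except ImportError as exc:
                error = type(exc).__name__
            return found, error
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("fake_importer", None)
```

Calling `demo()` should show that the module is found while the first attribute access fails, which matches the `getattr(module, class_name)` frame in the traceback above.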

teoria commented 3 days ago

@pierre-monnet I now load the modules dynamically and use exception handling for uninstalled dependencies. It's not good practice, but it will fix the issue. Could you try again?

@jochenchrist I think that instantiating just the importer that will be used is a good improvement in memory usage.
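
A rough sketch of the idea, not the code in this PR: the factory below stores only a module path and class name per format, imports the module when that format is requested, and turns a missing optional dependency into a readable error. The `ImporterFactory` here is a hypothetical stand-in, `register_lazy_importer` and `create` are assumed method names, and the standard-library `json.JSONDecoder` merely plays the role of an importer class whose dependency is installed.

```python
import importlib


class ImporterFactory:
    """Hypothetical stand-in for the project's factory, not its actual code."""

    def __init__(self):
        # format name -> (module_path, class_name); nothing is imported yet
        self._lazy_importers = {}

    def register_lazy_importer(self, import_format, module_path, class_name):
        self._lazy_importers[import_format] = (module_path, class_name)

    def create(self, import_format):
        if import_format not in self._lazy_importers:
            raise ValueError(f"Unknown import format: {import_format}")
        module_path, class_name = self._lazy_importers[import_format]
        try:
            # The import happens here, only for the format that was requested.
            module = importlib.import_module(module_path)
        except ImportError as exc:
            raise RuntimeError(
                f"The importer for '{import_format}' needs an optional "
                f"dependency that is not installed: {exc}"
            ) from exc
        return getattr(module, class_name)()


factory = ImporterFactory()
# Stand-in for an importer whose dependency is available.
factory.register_lazy_importer("ok", "json", "JSONDecoder")
# Stand-in for an importer whose dependency is missing.
factory.register_lazy_importer("broken", "module_that_is_not_installed_xyz", "Missing")
```

With this shape, registering the `broken` format does not fail at import time; only `factory.create("broken")` raises, while `factory.create("ok")` still works.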

pierre-monnet commented 3 days ago

> @pierre-monnet I now load the modules dynamically and use exception handling for uninstalled dependencies. It's not good practice, but it will fix the issue. Could you try again?
>
> @jochenchrist I think that instantiating just the importer that will be used is a good improvement in memory usage.

The import works now! 🚀