Azure / MachineLearningNotebooks

Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft
https://docs.microsoft.com/azure/machine-learning/service/
MIT License

RunConfiguration.load() does not work #322

Open richardleeaus opened 5 years ago

richardleeaus commented 5 years ago

Hi

It appears that RunConfiguration.load() does not work. For example, given this code snippet:

hdi_run_config = RunConfiguration()
hdi_run_config.load(path='.', name='hdi')

It finds the file, but after putting a debug point after the load, the object clearly isn't loaded with hdi.runconfig and is instead loaded with a default local.runconfig (which isn't in the project). In addition, after doing an exp.submit, it automatically creates an aml_config directory with docker.runconfig, local.runconfig, and conda_dependencies.yml, which is not related to what we are doing (these are just templates anyway).
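For context, in azureml-core, RunConfiguration.load() appears to be a static method that returns the loaded configuration rather than mutating the instance it is called on, which would explain why the object above still holds the defaults. A minimal sketch of that usage, assuming the static-method signature from the azureml-core docs:

from azureml.core.runconfig import RunConfiguration

# load() returns a new RunConfiguration; capture the return value,
# since calling it on an existing instance leaves that instance unchanged.
hdi_run_config = RunConfiguration.load(path='.', name='hdi')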

j-martens commented 5 years ago

Thanks for your report. Are you still experiencing this issue? Thank you!

v-strudm-msft commented 4 years ago

Thank you for reaching out to us. We see our answer was delayed; our apologies. We did not receive a response to our post, so we will close this issue for now. Should you need further assistance, please submit a post on this forum and we will respond promptly.

Pratibha2007 commented 3 years ago

I am also facing the same issue. What is the solution to get rid of it?

bandsina commented 3 years ago

Hi @Pratibha2007, what is your scenario? Can you please help us understand it so we can figure out the best way to achieve it?

Pratibha2007 commented 3 years ago

Thanks for reaching out!

I created a YAML file to update the Python version and add a few pip/conda dependencies, but when I check the runconfig after loading from this YAML, I can still see the default runconfig. I also tried a second way of initializing the values from a CondaDependencies object; this time, when I inspect the runconfig, I can see the dependencies listed, but they never appear in the Spark driver logs in Databricks.

runconfig = RunConfiguration()
runconfig.load(path='/dbfs/mnt/dev/test/', name='test1')

and the contents of test1 are:

script:
arguments: []
target: local
framework: Python
communicator: None
maxRunDurationSeconds:
nodeCount: 1
priority:
environment:
  name:
  version:
  environmentVariables:
    EXAMPLE_ENV_VAR: EXAMPLE_VALUE
  python:
    userManagedDependencies: false
    interpreterPath: python
    condaDependenciesFile:
    baseCondaEnvironment:
    condaDependencies:
      name: project_environment
      dependencies:

2) Code snippet for the second way:

conda_dep = CondaDependencies.create(conda_packages=['tensorflow==2.4.0'])
conda_dep.add_pip_package("tensorflow_probability")
conda_dep.add_pip_package("tensorflow==2.2.0")
conda_dep.set_python_version("3.7.6")
runconfig = RunConfiguration(conda_dependencies=conda_dep)
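Before submitting, one way to confirm whether these dependencies actually landed on the run configuration is to serialize them; a hedged check, assuming CondaDependencies.serialize_to_string() from azureml-core:

# Print the conda environment currently attached to the runconfig;
# the tensorflow and pip entries added above should appear in this output.
print(runconfig.environment.python.conda_dependencies.serialize_to_string())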

And in the attached screenshot of the Spark driver UI, there is no Dependency section.

The last way I tried was directly specifying pypi_libraries, which works as expected, but then I have no option to change the Python and pip versions, and hence cannot get the higher versions of the packages I am looking for. Here I can see the packages in the dependencies:

[screenshot: packages listed under the job's dependencies]

I want to update the Python version using a conda dependency to get the TensorFlow version I need for my project, and submit the pipeline from Azure Databricks to Azure ML:

step3 = DatabricksStep(name='train_data',
                       run_name='train_data',
                       num_workers=4,
                       notebook_path=data_train_path,
                       pypi_libraries=[PyPiLibrary(package='tensorflow==2.0.0b1')],
                       # jar_libraries=[JarLibrary(library='dbfs:/FileStore/jars/24d2a4e9_9dd9_43ba_81bb_82a643566fff/feature_utils-0.3.20190125.2-py3-none-any.whl')],
                       compute_target=databricks_compute,
                       runconfig=runconfig,
                       node_type='Standard_DS12_v2',
                       allow_reuse=False)

pipeline = Pipeline(workspace=ws, steps=step_sequence)
print('Pipeline is built')
pipeline.validate()
print('Pipeline validation complete')

exp_name = 'eta_test_run'
exp = Experiment(ws, exp_name)
pipeline_run = exp.submit(pipeline)
print('Pipeline is submitted for execution')
pipeline_run.wait_for_completion(show_output=True)

But this doesn't work.

Apologies for the long thread, but I have tried everything I could here.

Pratibha2007 commented 3 years ago

@bandsina is there any update on this issue, please?

paulshealy1 commented 3 years ago

@Pratibha2007 I believe you're loading a different file than the one you intended. Instead of this:

runconfig.load(path='/dbfs/mnt/dev/test/',name='test1')

Can you please provide the full path to your config, like so?

runconfig.load(path='/dbfs/mnt/dev/test/test1')
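For completeness, both calling conventions should capture the return value, since load() appears to be a static factory rather than an in-place loader; a sketch under that assumption:

from azureml.core.runconfig import RunConfiguration

# Variant 1: directory plus configuration name
runconfig = RunConfiguration.load(path='/dbfs/mnt/dev/test/', name='test1')

# Variant 2: full path to the configuration file, as suggested above
runconfig = RunConfiguration.load(path='/dbfs/mnt/dev/test/test1')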

Pratibha2007 commented 3 years ago

@paulshealy1 I already tried this, but it didn't help. The problem is something else, because even when I create the CondaDependencies object explicitly in the RunConfiguration constructor, I still can't see the dependencies listed in the jobs UI in Databricks. So it seems the runconfig is getting completely overridden when experiment.submit() is called.
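One possible way to narrow this down is to round-trip the in-memory configuration to disk immediately before submission and compare it with what the cluster actually receives; a diagnostic sketch, assuming RunConfiguration.save() from azureml-core:

# Persist the runconfig exactly as it exists just before exp.submit();
# if this file lists the conda/pip packages but the job cluster never
# installs them, the override happens inside the submission path.
runconfig.save(path='./debug_runconfig', name='presubmit')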

msha1026 commented 3 years ago

I am also experiencing the same issue. Providing libraries through pypi_libraries works as expected. However, trying to use run_config does not install the libraries I need on the job clusters. Would appreciate any guidance on this issue.

v-kumudam commented 3 years ago

@msha1026 Is this "RunConfiguration.load() does not work" issue resolved?

v-kumudam commented 3 years ago

@richardleeaus Has the "RunConfiguration.load() does not work" problem been resolved? Currently we are unable to reproduce this from the GitHub repo.