AgileBioFoundry / multiomicspaper

Issues in installation of packages from the requirements file #20

Open Vaibhav-22-dm opened 2 years ago

Vaibhav-22-dm commented 2 years ago

I am having trouble building the two kernels described in the kernel_requirements folder. I suspect the two kernels use different versions of Python, because running pip install -r requirements_art_3.6.txt directly produced many errors. It would be very helpful to clarify which Python version each kernel requires, since the requirements files seem to have last been updated two years ago.

After several failed attempts, I installed Anaconda and then installed the following packages:

  1. pipenv: 2022.5.2
  2. depinfo: 1.7.0
  3. python-libsbml (or just libsbml): 5.19.5
  4. rfc3986: 2.0.0
  5. h11: 0.13.0
  6. rich: 12.4.4
  7. pydantic: 1.9.1
  8. diskcache: 5.4.0
  9. importlib_resources: 5.7.1
  10. Semver: 2.13.0
  11. Pathvalidate: 2.5.0
  12. pydoe: 0.3.8
  13. tpot: 0.11.7
  14. edd-utils: 0.0.12
  15. pytorch: 1.11.0
  16. mpi4py: 3.1.3
  17. pymc3: 3.11.4
  18. blas: 1.0

Python version (provided by Anaconda): 3.9.12
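
To compare an environment like this one against a pinned list, something along the following lines could be used. This is only a minimal sketch and not part of the repository: the helper name `find_version_mismatches` and the `PINNED` dictionary are hypothetical, and only a few entries from the list above are included. It relies on the standard-library `importlib.metadata.version` function (Python 3.8 or later).

```python
from importlib import metadata


def find_version_mismatches(pinned, lookup=metadata.version):
    """Return {package: (wanted, found)} for each package whose installed
    version differs from the pinned one; found is None if not installed."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# Hypothetical subset of the versions listed above.
PINNED = {"pymc3": "3.11.4", "tpot": "0.11.7", "pydantic": "1.9.1"}

if __name__ == "__main__":
    # Report every package that is missing or at a different version.
    for name, (wanted, found) in find_version_mismatches(PINNED).items():
        print(f"{name}: wanted {wanted}, found {found or 'not installed'}")
```

Running this inside each kernel's environment would show which packages differ from the versions the requirements file expects, assuming the pinned versions are copied from that file.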

After installing these packages and CPLEX, I ran notebooks A, B, and C successfully, but notebook D failed when I ran the following code cell:

%%time
if run_art:
    art = RecommendationEngine(df, **art_params)
else:
    with open(os.path.join(art_params['output_directory'], 'art.pkl'), 'rb') as output:
        art = pickle.load(output)

This is the error that appears:

Defaulting to a maximum of 6 cores for MCMC sampling (all available).  See the max_mcmc_cores parameter to control ART's use of parallelism.
Warning: Dataframe does not have a time column matching one of the supported formats. Assuming that all data in the file comes from a single time point.
C:\Users\vaibh\anaconda3\lib\site-packages\xgboost\compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
  from pandas import MultiIndex, Int64Index
C:\Users\vaibh\anaconda3\lib\site-packages\pandas\core\internals\blocks.py:938: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  arr_value = np.asarray(value)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File <timed exec>:2, in <module>

File C:\My Drive\D Drive Vaibhav\Machine Learning\Foreign Training\multiomicspaper\notebooks\../../AutomatedRecommendationTool\art\core.py:388, in RecommendationEngine.__init__(self, df, input_vars, input_var_type, bounds_file, scale_input_vars, response_vars, build_model, cross_val, ensemble_model, standardize, intercept, recommend, objective, threshold, target_values, num_recommendations, rel_rec_distance, niter, alpha, output_directory, max_mcmc_cores, verbose, testing, seed, initial_cycle, warning_callback, last_dashes_denote_replicates, num_sklearn_models, num_tpot_models)
    386     self.save_pkl_object()
    387 elif build_model:
--> 388     self.build_model()
    389     if recommend:
    390         self.optimize()

File C:\My Drive\D Drive Vaibhav\Machine Learning\Foreign Training\multiomicspaper\notebooks\../../AutomatedRecommendationTool\art\core.py:591, in RecommendationEngine.build_model(self)
    588 self._initialize_models()
    590 if self.cross_val:
--> 591     self._cross_val_models()
    592     plot.predictions_vs_observations(self, cv_flag=True, errorbars_flag=True)
    594 self._fit_models()

File C:\My Drive\D Drive Vaibhav\Machine Learning\Foreign Training\multiomicspaper\notebooks\../../AutomatedRecommendationTool\art\core.py:1042, in RecommendationEngine._cross_val_models(self)
   1035         cv_predictions[j][i] = level0_cv_predictions
   1037 # ================================================== #
   1038 # Cross validated predictions for the ensemble model
   1039 # -------------------------------------------------- #
   1040 
   1041 # Build (fit) ensemble model
-> 1042 self._build_ensemble_model(idx=train_idx)
   1044 # Predictions with ensemble model
   1045 # Apart from the mean values, store prediction std and draws for plotting
   1046 # (not possible always due to a bug in pymc3)
   1047 f = np.zeros((len(test_idx), self.num_models, self.num_response_var))

File C:\My Drive\D Drive Vaibhav\Machine Learning\Foreign Training\multiomicspaper\notebooks\../../AutomatedRecommendationTool\art\core.py:968, in RecommendationEngine._build_ensemble_model(self, idx)
    965 if self.standardize:
    966     self._standardize_level1_data()
--> 968 self._ensemble_model(idx)

File C:\My Drive\D Drive Vaibhav\Machine Learning\Foreign Training\multiomicspaper\notebooks\../../AutomatedRecommendationTool\art\core.py:1407, in RecommendationEngine._ensemble_model(self, idx, testing)
   1397 if not testing:
   1398     # Instantiate sampler and draw samples from the posterior.
   1399     # Omit the random_seed parameter, since PYMC3 @3.8 internally calls
   (...)
   1404     # chains.  That should still be predictable since ART calls np.random.seed()
   1405     # above.
   1406     step = pm.NUTS()  # Slice, Metropolis, HamiltonianMC, NUTS
-> 1407     self.trace[j] = pm.sample(
   1408         const.n_iterations,
   1409         step=step,
   1410         initvals=initvals,
   1411         progressbar=progressbar,
   1412         tune=const.tune_steps,
   1413         cores=cores,
   1414         # work around an API update to be added in PYMC3 4.0
   1415         return_inferencedata=False,
   1416         # ,  init=adapt_diag
   1417         # live_plot=True, skip_first=100, refresh_every=300, roll_over=1000
   1418     )
   1420     logger = logging.getLogger("pymc3")
   1421     logger.propagate = True

File ~\anaconda3\lib\site-packages\pymc3\sampling.py:515, in sample(draws, step, init, n_init, start, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, jitter_max_retries, return_inferencedata, idata_kwargs, mp_ctx, pickle_backend, **kwargs)
    513         step = assign_step_methods(model, step, step_kwargs=kwargs)
    514 else:
--> 515     step = assign_step_methods(model, step, step_kwargs=kwargs)
    517 if isinstance(step, list):
    518     step = CompoundStep(step)

File ~\anaconda3\lib\site-packages\pymc3\sampling.py:217, in assign_step_methods(model, step, methods, step_kwargs)
    209         selected = max(
    210             methods,
    211             key=lambda method, var=var, has_gradient=has_gradient: method._competence(
    212                 var, has_gradient
    213             ),
    214         )
    215         selected_steps[selected].append(var)
--> 217 return instantiate_steppers(model, steps, selected_steps, step_kwargs)

File ~\anaconda3\lib\site-packages\pymc3\sampling.py:143, in instantiate_steppers(_model, steps, selected_steps, step_kwargs)
    141 unused_args = set(step_kwargs).difference(used_keys)
    142 if unused_args:
--> 143     raise ValueError("Unused step method arguments: %s" % unused_args)
    145 if len(steps) == 1:
    146     return steps[0]

ValueError: Unused step method arguments: {'initvals'}
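
The traceback suggests why `initvals` is rejected: the `sample` signature of the installed pymc3 lists `start` but no `initvals` parameter, so the keyword ends up in `**kwargs` and is passed on as a step method argument. The sketch below illustrates one possible workaround, which is to choose the keyword by inspecting the signature. It does not import pymc3. The names `pick_init_kwarg`, `sample_with_init`, `old_sample`, and `new_sample` are hypothetical stand-ins, and the sketch assumes that `start` plays the same role in the older API that `initvals` plays in the newer one.

```python
import inspect


def pick_init_kwarg(sample_fn):
    """Return the keyword that sample_fn accepts for initial values."""
    params = inspect.signature(sample_fn).parameters
    if "initvals" in params:
        return "initvals"
    if "start" in params:
        return "start"
    raise TypeError("sample function accepts neither 'initvals' nor 'start'")


# Hypothetical stand-ins that mimic the two signatures; not pymc3 itself.
def old_sample(draws, step=None, start=None, **kwargs):
    # Mimics the behavior in the traceback: unknown keywords are rejected.
    if kwargs:
        raise ValueError("Unused step method arguments: %s" % set(kwargs))
    return {"draws": draws, "init": start}


def new_sample(draws, step=None, initvals=None, **kwargs):
    return {"draws": draws, "init": initvals}


def sample_with_init(sample_fn, draws, init):
    """Call sample_fn, passing init under whichever keyword it supports."""
    return sample_fn(draws, **{pick_init_kwarg(sample_fn): init})
```

With the real library, the same idea would mean editing the `pm.sample` call in a local copy of `art/core.py`, which passes more arguments than this sketch does. Installing a pymc3 release whose `sample` accepts `initvals` may be the simpler fix.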

I raised the same issue in the AutomatedRecommendationTool repository. There, I was advised to use the provided Docker workflow, but since my goal is to run multiomicspaper, I need to build the kernels according to the requirements listed here.

If possible, could anyone please help me resolve the ValueError: Unused step method arguments: {'initvals'}, or at least provide a step-by-step installation guide that states the required Python versions?

mhgarci1 commented 2 years ago

I saw that you opened another ticket for ART. That should have all the information you need to make it work. Let me know if that is not the case.

Vaibhav-22-dm commented 2 years ago

Yes, I raised the same issue in the ART repository, but I was asked there to follow the Docker workflow. This repository includes two requirements files for building the kernels. However, when I created a virtual environment and tried to install all the packages listed in one of those files, I got many conflicts and errors, even with a fresh installation of Python. I thought the multiomicspaper repository might build its kernels differently from ART, since it has to use both OMG and ART. If that is not the case, I will follow the Docker workflow. Please let me know if there is a standard way to build both kernels. Thanks for replying to my issue.