Open Samcoodess opened 1 year ago
@Samcoodess On a dev branch on your fork, can you try making an environment with an environment.yml file that looks something like the following:
name: rcfm-analysis
channels:
- conda-forge
dependencies:
- python=3.11
- pandas=1.5 # Restricting this to sub pandas v2.0 as the pandas API changed here
- matplotlib
- notebook
- jupyterlab
(you can expand this as needed later).
You'd create this environment initially with
conda env create --file environment.yml
and then activate it with
conda activate rcfm-analysis
If you later update the environment.yml file, then with the environment of the same name activated you can just do
conda env update --file environment.yml
# micromamba install --file environment.yml # The command is different for mamba/micromamba
to install those new dependencies into your environment.
Try this and let us know how things go.
For another Fellow project I'm mentoring we're also discussing environment files on https://github.com/AndriiPovsten/Snakemake-backend-for-RECAST/issues/5, so feel free to cross-post and to talk to people like Andrii as well.
OK, @matthewfeickert, thank you. I will review his work, and cross-posting seems like a great idea.
@matthewfeickert
I created a new environment using the environment.yml file in the dev branch of my forked RCFM repository. Then I added the environment I created to .gitignore, as we had discussed earlier. Now, should I also not commit the environment.yml file to the dev branch?
Now, should I also not commit the environment.yml file to the dev branch?
The environment.yml file should be under version control. This is (one of) the thing(s) you want to share with everyone so that they can set up an environment that works with the code you have. So you can add this and push your dev branch to your fork (https://github.com/Samcoodess/RCFM).
Thanks @matthewfeickert. Here is the link to the dev branch: RCFM dev branch
It has an environment.yml file, and the environment "rcfm-analysis" is added to the gitignore.
Cool. :+1: What you have in there now is the same file as I gave as an example in https://github.com/Samcoodess/reana-dms/issues/5#issuecomment-1662917991, which is fine, but I just gave that as an example and the environment doesn't really have anything to do with the analysis. What you should do now is figure out which additional dependencies need to be added to the environment.yml so that anyone who installs the described environment will be ready to do work.
A good way to find what to look for is to check the output of running
git grep "import "
at the top level of the repository and see which modules end up getting imported. This will be a good start, but you'll need to refine things: some imported modules might be part of the Python standard library and so aren't something you can define as an external dependency, and some modules might be dependencies of other libraries used (e.g. numpy is a dependency of scipy, and scipy tightly restricts what versions of numpy are allowed with each version (more on this if you're wanting a deep dive), so it doesn't make sense to add both scipy and numpy to your dependencies list).
A good check that you have something close to the right environment specified is whether you can run the analysis notebooks, deactivate and delete the environment, create it again from the environment.yml file, and then rerun the analysis notebooks.
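Before rerunning whole notebooks, a quicker (and weaker) sanity check is to ask Python whether the modules you expect can be found in the freshly recreated environment. This is only a sketch: the helper name missing_modules and the required list are hypothetical and should be replaced with whatever the notebooks actually import.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list; replace it with the modules your notebooks import
required = ["pandas", "matplotlib"]
print("missing:", missing_modules(required))
```

An empty list only tells you the modules are installed; it does not replace actually rerunning the notebooks, since version mismatches would not show up here.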
Please ask any questions you might have along the way. Learning how to manage virtual environment dependencies is not easy right away. :)
Hello mentors, I am having an issue running the target analysis RCFM locally. Following the setup instructions in the GitHub repo README, I tried the recommended version, i.e. Anaconda with Python 3.6.
All the steps I took to run the analysis are given in detail below:
First, I forked the target analysis repository creating my own forked version of the analysis. Forked RCFM @Samcoodess
I cloned the forked repo in my Vscode terminal using
git clone {SSH}
Then, I navigated to my project directory and created an environment using
conda create newenv
I activated the environment using
conda activate newenv
Then, to install python=3.6 as mentioned in the GitHub readme, I ran
conda install python=3.6
Upon running this, I got an error. I figured out that Python 3.6 isn't available in the default channels provided by Anaconda for macOS on ARM architecture, and python=3.6 is pretty much dead. I checked my Python version with
python3 --version
Then, I tried installing matplotlib and jupyter notebook using
conda install -n newenv jupyter notebook
and
conda install -n newenv matplotlib
All the packages were installed. I used the command
jupyter notebook
to navigate through my notebooks and ran model.ipynb
Upon running model.ipynb in the kernel of my environment, the first block of code, which imports the necessary modules, throws a ModuleNotFoundError.
I thought my kernel wasn't using the conda environment, and to check I ran
import sys
print(sys.executable)
But it's taking too long to run just these two lines.
When I tried installing matplotlib in the notebook itself by using
!pip install matplotlib
, it said