carpentries-incubator / hpc-intro

An Introduction to High Performance Computing
https://carpentries-incubator.github.io/hpc-intro/

Assumption that python3 does not exist #310

Open ocaisa opened 3 years ago

ocaisa commented 3 years ago

In https://github.com/carpentries-incubator/hpc-intro/blame/gh-pages/_episodes/14-modules.md#L100 the assumption is made that python3 does not exist in the user's environment by default. There are many cases where this is unlikely to be true. Can we think of something a bit more likely?

ocaisa commented 3 years ago

In general this probably means that this part of the lesson is now a little outdated.

mikerenfro commented 3 years ago

I'd suggested a Miniconda module a couple of times in the hpc-carpentry Slack channel (2020-08-23 and 2021-02-07). It could be a full Anaconda module, since that would fit into the Pi episode requiring numpy, as noted in #302.

ocaisa commented 3 years ago

I'd be against this and so would Compute Canada: https://docs.computecanada.ca/wiki/Anaconda/en

ocaisa commented 3 years ago

Sorry, that was a bit blunt. What I wanted to say was that I know quite a few sites that explicitly advise against Conda, so it's unlikely to be an option that fits everyone.

I think what is needed is just a template for a (set of) module load commands that will give you the right environment, which I believe right now is mpi4py + numpy + python3 (maybe even python3 should just be python). I made a step in that direction in https://github.com/carpentries-incubator/hpc-intro/pull/312#discussion_r579282355.
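
As a rough sketch of how an instructor could confirm that whatever module load commands a site chooses really deliver that environment, something like the following could be run after loading the modules. The `REQUIRED` list and the function name `missing_packages` are assumptions made for illustration, not part of the lesson.

```python
import importlib.util

# Assumption: these are the packages the lesson currently needs.
REQUIRED = ["numpy", "mpi4py"]


def missing_packages(names):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing after module load:", ", ".join(missing))
    else:
        print("Environment provides everything the lesson needs.")
```

Run with the interpreter the modules provide, an empty result suggests the template works for that site. It only checks that the packages can be found, not that they perform well.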

tkphd commented 3 years ago

Respectfully, we serve a broad community. Both Lmod and Anaconda are popular approaches to software maintenance. Conda gives users control over their software stack, for the price of a little more disk space. VirtualEnv and Singularity are great options as well, but not as widespread, yet. I'm in favor of crafting a Conda episode, as a direct analogue to the Modules episode, for Instructors to swap in as needed. Modules will remain the default in the lesson material, so ComputeCanada need not worry.

ocaisa commented 3 years ago

I understand that, but it's important to note that the price of Anaconda is not just disk space; it is also frequently performance. I have no objections to having an alternative episode though.

jstaf commented 3 years ago

I originally picked python3 as the example of loading a module mainly because it was relatively cross-domain and wasn't there by default on CentOS 6/7. Now that python3 is standard in all modern Linux distributions, you might consider switching this to R. R is widely used, virtually guaranteed to be available as a module, and features in multiple Software Carpentry workshops.

Alternatively, you could just mention that the system python3 won't be as performant, up-to-date, or as convenient as python3 provided by a module and leave things at that.
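
To make that difference visible to learners, a small check along these lines could be run before and after loading the module. This is only a sketch: the helper name `describe_python` is made up for illustration, and it reports just the path and version, not performance.

```python
import shutil
import subprocess


def describe_python(command="python3"):
    """Return (path, version string) for `command` on PATH, or None if absent."""
    path = shutil.which(command)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Python 2 printed the version to stderr; Python 3.4+ prints it to stdout.
    version = (result.stdout or result.stderr).strip()
    return path, version
```

Calling `describe_python()` before and after loading the module should show whether the path and version change, or return `None` on a system where python3 really is absent.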