dguest / pandamonium

Command line scripts to parse panda web api
BSD 3-Clause "New" or "Revised" License

Are we voms or are we vomsless? #35

Open dguest opened 4 years ago

dguest commented 4 years ago

I realized that we're depending on panda-client.

Of course this has always been true, since there are two ways to use pandamonium:

  1. You just use pandamon which depends on ~nothing and should run on any system
  2. You also use panda-resub-taskid and panda-kill-taskid which depend on panda-client, and through that, voms-proxy-init

On a machine where voms-proxy-init doesn't work, panda-client doesn't work, and therefore the whole second use-case doesn't work. On the other hand, every machine that has voms has panda (counterexamples welcome). I find it sloppy to list just one of them (panda-client) as a pip dependency if we can't use it.

@matthewfeickert pointed out that rather than removing panda-client from the dependencies, we could remove everything that depends on voms-proxy-init from the base version, since in our new automated CI greatness we can just push two versions (e.g. pandamonium and pandamonium-voms), where the latter can also include the voms-dependent stuff.

So, is it worth it? Will that confuse people if panda-*-taskid go away in more recent versions? Should we call it pandamonium-light?

matthewfeickert commented 4 years ago

> On the other hand, every machine that has voms has panda (counterexamples welcome).

This isn't a guarantee. It holds on a CVMFS-enabled cluster, but not if you're on your own machine and trying to install the minimal amount of software. Put another way: panda has a dependency on voms-clients, but voms-clients has no dependency on panda (cf. the docs for voms-clients).

As a short example that demonstrates why @kratsg has installed additional certs in his Dockerfile, and why one should view the dependencies on panda-client and voms-clients as separate:

$ docker run --rm -it -e user=$USER python:3.8 /bin/bash
# apt update -qq && apt install -qq -y voms-clients
# python -m pip install -q --upgrade pip setuptools wheel
# python -m pip install --upgrade "git+https://github.com/dguest/pandamonium.git"
# python -m pip list
Package      Version
------------ --------
panda-client 1.4.36
pandamonium  0.2.dev3
pip          20.2.3
setuptools   50.3.0
wheel        0.35.1
# panda-kill-taskid 14353
INFO : Need to generate a grid proxy
Enter GRID pass phrase for this identity:
ERROR : /bin/sh: 1: source: not found
Cannot open fileFilename=/root/.rnd
Function: RAND_load_file
unable to access trusted certificates in:x509_cert_dir=/etc/grid-security/certificates
Function: proxy_init_cred
ERROR : Could not generate a grid proxy
# python -m pip uninstall -y panda-client
# python -m pip list
Package     Version
----------- --------
pandamonium 0.2.dev3
pip         20.2.3
setuptools  50.3.0
wheel       0.35.1
# panda-kill-taskid 14353
Failed to load PandaClient, please set up locally

> I find it sloppy to list just one of them (panda-client) as a pip dependency if we can't use it.

As there are explicit dependencies on panda-client (which is confusingly named given that the library it gives you is named pandatools :disappointed:) in

https://github.com/dguest/pandamonium/blob/2c9951664b541199120530dd0df838086f6c4e4f/src/pandamonium/panda_kill_taskid.py#L17

and

https://github.com/dguest/pandamonium/blob/2c9951664b541199120530dd0df838086f6c4e4f/src/pandamonium/panda_resub_taskid.py#L18

a Python library should make clear what its dependencies are in setup.py/setup.cfg, as is done now

https://github.com/dguest/pandamonium/blob/2c9951664b541199120530dd0df838086f6c4e4f/setup.cfg#L41-L42

Otherwise you get into situations where you try to install a library and it fails because you don't have the dependencies for it and the library didn't specify them (example: libraries that import numpy as np in their core but don't have numpy as a requirement). This is a nightmare for everyone involved. Basically, if you import something in your library, make sure that you specify you depend on it. To be fair, that isn't really an issue here, but it is a good idea to make it clear to both pip and users what your library truly depends on.
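To make the point concrete, here is a minimal sketch, using only the standard library, that reports which of a package's imports cannot be resolved in the current environment. The helper name `missing_modules` is hypothetical and not part of pandamonium; `pandatools` is the module name that panda-client provides, as noted above.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    # "pandatools" is the import name shipped by panda-client.
    for name in missing_modules(["pandatools"]):
        print(f"'{name}' is imported but not installed; declare it as a dependency")
```

Any name this reports for a module the library imports is a candidate for the declared requirements.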

> @matthewfeickert pointed out that rather than removing panda-client from the dependencies, we could remove everything that depends on voms-proxy-init in the base version,

This is actually not what I'm suggesting. To be clear, I think that having a dependency on panda-client is extremely light and not a problem at all. However, if you wanted to say to someone that "You can use pandamon by pip installing pandamonium which has no requirements on any other library" then you could achieve this (maybe with some light library restructuring) by making panda-client an optional extra in setup.py, named voms or something. So in the same way that you can currently do

# python -m pip install pandamonium[lint] # Not yet as we don't have the PyPI namespace
python -m pip install --upgrade "git+https://github.com/dguest/pandamonium.git#egg=pandamonium[lint]"

and get pyflakes and black installed given

https://github.com/dguest/pandamonium/blob/2c9951664b541199120530dd0df838086f6c4e4f/setup.py#L4

you could do

python -m pip install pandamonium[voms]

and get panda-client.
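A minimal sketch of how such an extra might be declared, assuming the extras live in a dictionary passed to `setuptools.setup` through `extras_require`. The `lint` entry follows the description above; the `voms` name and the `main` wrapper are hypothetical, and the repository's actual setup.py may be organised differently.

```python
# Hypothetical excerpt of a setup.py; not the repository's actual file.
extras_require = {
    "lint": ["pyflakes", "black"],  # as described above
    "voms": ["panda-client"],       # assumed name for the optional extra
}


def main():
    # Imported here so the dictionary above can be inspected without setuptools.
    from setuptools import setup

    # The remaining metadata is assumed to come from setup.cfg.
    setup(extras_require=extras_require)


# In a real setup.py, main() would be called at module level.
```

With this layout a plain install would pull in nothing extra, and only the `[voms]` form would request panda-client.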

> since in our new automated CI greatness we can just push two versions (e.g. pandamonium and pandamonium-voms), where the latter can also include the voms-dependent stuff.

No, you actually just distribute one library. Since setup.py is a Python script, it gets interpreted at runtime and decides what needs to be installed given the inputs from pip. :+1:

> So, is it worth it? Will that confuse people if panda-*-taskid go away in more recent versions? Should we call it pandamonium-light?

I don't know if I'm the best at knowing what will and won't confuse people. There's no need to call it anything different, but I think that documentation is going to be pretty important. The good(?) news though is that I think most of the current users of pandamonium aren't doing new setups a lot, but given the decisions being made in this Issue it might be worth broadcasting to the user base what's going on and asking for feedback. @kratsg might also have thoughts on this.

dguest commented 4 years ago

Ah, thanks for explaining the details of these installation options.

I sort of like the idea of having a pip install pandamonium[light] option, together with some kind of check for voms-proxy-init in the standard setup script that can warn people if they are about to install on a system that might not support everything they hope for.
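A minimal sketch of such a check, assuming it is enough to look for the executable on `PATH`. The function name `has_voms_proxy_init` is hypothetical. Installs from a wheel do not execute setup.py, so this sketch assumes the check runs when a script starts, not at install time.

```python
import shutil
import warnings


def has_voms_proxy_init(executable="voms-proxy-init"):
    """Return True if the executable is on PATH; otherwise warn and return False."""
    if shutil.which(executable) is not None:
        return True
    warnings.warn(
        f"{executable} was not found on PATH; "
        "panda-resub-taskid and panda-kill-taskid may not work on this system"
    )
    return False
```

A script such as panda-kill-taskid could call this before doing anything else, so the user sees the warning before being asked for a GRID pass phrase.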

matthewfeickert commented 4 years ago

> I sort of like the idea of having a pip install pandamonium[light] option

This is usually the other way around. You compose additional requirements into the extra from a starting base, rather than trying to prune them away. That is, one should be able to run pip install --upgrade <base library> at any time without breaking the distribution setup that was installed through pip install <base library>[extra].