HDFGroup / hsds

Cloud-native, service based access to HDF data
https://www.hdfgroup.org/solutions/hdf-kita/
Apache License 2.0

can't find hsds-node executable #211

Closed assaron closed 1 year ago

assaron commented 1 year ago

Hi, I'm trying to run the HSDS server from this package without Docker and am running into the following problem:

~/workspace/hsds$ ./runall.sh --no-docker
--no_docker option specified - using directory: /tmp/hs for socket and log files
using password file: admin/config/passwd.default
no AWS or AZURE env set, using POSIX storage
no docker startup
Using posix storage: /home/alserg/hsds_data
set logging to: 20
logfile: /home/alserg/workspace/hsds/hs.log
INFO:root:using cmd_dir: /usr/local/bin
Traceback (most recent call last):
  File "/usr/local/bin/hsds", line 33, in <module>
    sys.exit(load_entry_point('hsds==0.7.2', 'console_scripts', 'hsds')())
  File "/usr/local/lib/python3.9/dist-packages/hsds-0.7.2-py3.9.egg/hsds/app.py", line 333, in main
    app.run()
  File "/usr/local/lib/python3.9/dist-packages/hsds-0.7.2-py3.9.egg/hsds/hsds_app.py", line 272, in run
    raise FileNotFoundError("can't find hsds-node executable")
FileNotFoundError: can't find hsds-node executable

However, there is an hsds-node executable in /usr/local/bin:

$ which hsds-node
/usr/local/bin/hsds-node

Why could I be getting this error?

jreadey commented 1 year ago

Hi @assaron, I just tried this from a clean environment on my Linux notebook and didn't have any problems.

Do you have /usr/local/bin/ in your PATH? If you run "hsds-node" on the command line, what happens?

assaron commented 1 year ago

@jreadey Yes, I have it, which is why I don't understand what's going on:

$ which hsds-node
/usr/local/bin/hsds-node
$ hsds-node --node_type=sn
hsds node main for node_type: sn
python version: 3.9.2 (default, Feb 28 2021, 17:03:44) 
[GCC 10.2.1 20210110]
sys path: ['/usr/local/bin', '/usr/lib/python39.zip', '/usr/lib/python3.9', '/usr/lib/python3.9/lib-dynload', '/home/alserg/.local/lib/python3.9/site-packages', '/usr/local/lib/python3.9/dist-packages', '/usr/local/lib/python3.9/dist-packages/requests_unixsocket-0.3.0-py3.9.egg', '/usr/local/lib/python3.9/dist-packages/PyJWT-2.4.0-py3.9.egg', '/usr/local/lib/python3.9/dist-packages/numcodecs-0.10.2-py3.9-linux-x86_64.egg', '/usr/local/lib/python3.9/dist-packages/aiohttp_cors-0.7.0-py3.9.egg', '/usr/local/lib/python3.9/dist-packages/aiohttp-3.8.1-py3.9-linux-x86_64.egg', '/usr/local/lib/python3.9/dist-packages/aiofiles-0.8.0-py3.9.egg', '/usr/local/lib/python3.9/dist-packages/aiobotocore-2.1.0-py3.9.egg', '/usr/local/lib/python3.9/dist-packages/yarl-1.8.1-py3.9-linux-x86_64.egg', '/usr/local/lib/python3.9/dist-packages/multidict-6.0.2-py3.9-linux-x86_64.egg', '/usr/local/lib/python3.9/dist-packages/frozenlist-1.3.1-py3.9-linux-x86_64.egg', '/usr/local/lib/python3.9/dist-packages/charset_normalizer-2.1.0-py3.9.egg', '/usr/local/lib/python3.9/dist-packages/async_timeout-4.0.2-py3.9.egg', '/usr/local/lib/python3.9/dist-packages/aiosignal-1.2.0-py3.9.egg', '/usr/local/lib/python3.9/dist-packages/wrapt-1.14.1-py3.9-linux-x86_64.egg', '/usr/local/lib/python3.9/dist-packages/botocore-1.23.24-py3.9.egg', '/usr/local/lib/python3.9/dist-packages/aioitertools-0.10.0-py3.9.egg', '/usr/local/lib/python3.9/dist-packages/jmespath-0.10.0-py3.9.egg', '/usr/local/lib/python3.9/dist-packages/hsds-0.7.2-py3.9.egg', '/usr/lib/python3/dist-packages']
INFO> Service node initializing
INFO> service node initializing
setLogConfig - level=INFO
INFO> Application baseInit
INFO> using node port: 5101
INFO> setting node_id to: sn-a2a4b
INFO> using bucket: hsdstest
INFO> using node port: 5101
INFO> aws_iam_role set to: hsds_role
INFO> aws_secret_access_key not set
INFO> aws_access_key_id not set
INFO> aws_region set to: us-east-1
INFO> Using metadata memory cache size of: 134217728
INFO> allow_noauth = True
INFO> initUserDB
WARN> No password file, file /config/passwd.txt not found
INFO> user_db initialized: 0 users
INFO> initgroupDB
INFO> No groups file
INFO> group_db initialized: 0 groups
INFO> run_app on port: 5101
======== Running on http://0.0.0.0:5101 ========
(Press CTRL+C to quit)
INFO> health check start
INFO> healthCheck - node_state: WAITING
INFO> register: http://head:5100/register
INFO> register req: http://head:5100/register body: {'id': 'sn-a2a4b', 'port': 5101, 'node_type': 'sn'}
INFO> http_post('http://head:5100/register', 3 bytes)
INFO> Initiating TCPConnector for http://head:5100/register with limit 100 connections
WARN> ClientError for http_post(http://head:5100/register): Cannot connect to host head:5100 ssl:default [Name or service not known] 
ERROR> HEAD node seems to be down.
ERROR> Unexpected UnboundLocalError exception in doHealthCheck: local variable 'rsp_json' referenced before assignment

I've added some debug output, and for some reason the cmd_path being checked in hsds_app.py is /usr/bin/hsds-node, not /usr/local/bin/hsds-node.

The debug code:

        cmd_path = os.path.join(sys.exec_prefix, "bin")
        cmd_path = os.path.join(cmd_path, "hsds-node")
        logging.info(f"sys.exec_prefix = {sys.exec_prefix}")
        logging.info(f"cmd_path = {cmd_path}")
        print(os.path.isfile(cmd_path))
        if not os.path.isfile(cmd_path):
            # search corresponding location for windows installs
            cmd_path = os.path.join(sys.exec_prefix, "Scripts")
            cmd_path = os.path.join(cmd_path, "hsds-node-script.py")
            if not os.path.isfile(cmd_path):
                raise FileNotFoundError("can't find hsds-node executable")

The output of hsds command:

$ hsds --root_dir /home/alserg/workspace/hsds/data --password_file admin/config/passwd.default --logfile hs.log --socket_dir /tmp/hs --loglevel INFO --config_dir=admin/config --count=4
set logging to: 20
logfile: /home/alserg/workspace/hsds/hs.log
INFO:root:using cmd_dir: /usr/local/bin
INFO:root:sys.exec_prefix = /usr
INFO:root:cmd_path = /usr/bin/hsds-node
False
Traceback (most recent call last):
  File "/usr/local/bin/hsds", line 33, in <module>
    sys.exit(load_entry_point('hsds==0.7.2', 'console_scripts', 'hsds')())
  File "/usr/local/lib/python3.9/dist-packages/hsds-0.7.2-py3.9.egg/hsds/app.py", line 333, in main
    app.run()
  File "/usr/local/lib/python3.9/dist-packages/hsds-0.7.2-py3.9.egg/hsds/hsds_app.py", line 275, in run
    raise FileNotFoundError("can't find hsds-node executable")
FileNotFoundError: can't find hsds-node executable

assaron commented 1 year ago

I have no idea why sys.exec_prefix is /usr for me and not /usr/local, which the documentation gives as the default. I'm using the system-installed Debian package python3.9, version 3.9.2-1.

Still, is there a reason why sys.exec_prefix is checked instead of cmd_dir?

assaron commented 1 year ago

I've also checked some other Ubuntu-based installations; there, sys.exec_prefix for the system-wide Python is /usr as well.
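
For anyone comparing installs, here is a minimal sketch (not part of HSDS) that prints the locations involved in this lookup. The helper name `describe_script_locations` is made up for illustration; it only uses the standard library.

```python
import os
import shutil
import sys
import sysconfig


def describe_script_locations(name="hsds-node"):
    """Collect the places a console script might be looked up (illustrative helper)."""
    return {
        # Prefix the interpreter was installed under, e.g. /usr in the logs above.
        "exec_prefix": sys.exec_prefix,
        # The path the check quoted above builds on POSIX: <exec_prefix>/bin/<name>.
        "exec_prefix_candidate": os.path.join(sys.exec_prefix, "bin", name),
        # Script directory of the default install scheme; this may differ from
        # where the script was actually installed on some systems.
        "sysconfig_scripts": sysconfig.get_path("scripts"),
        # First match on PATH, or None if the script is not on PATH.
        "on_path": shutil.which(name),
    }


if __name__ == "__main__":
    for key, value in describe_script_locations().items():
        print(f"{key}: {value}")
```

If exec_prefix_candidate and on_path disagree, as they do in the output above (/usr/bin/hsds-node versus /usr/local/bin/hsds-node), the exec_prefix-based check would be expected to fail even though the script runs fine from the shell.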

jreadey commented 1 year ago

I've only ever tested the no-docker option with Anaconda Python. In Anaconda, sys.exec_prefix points to something like /home/jreadey/anaconda3/envs/my_conda_env, and its bin subdirectory is where the hsds executables get installed. It should be easy enough to have the code look in /usr/local/bin/ if it doesn't find anything under exec_prefix. I'll give this a try.

Is there a specific reason you need to run with native python?

assaron commented 1 year ago

No, there is no specific reason why I need to use native Python, but it was a bit unexpected that it doesn't work even though all of the dependencies are installed.

As for the fix, wouldn't it make sense to just use cmd_dir, as below? It fixes the issue for me (at least, I'm able to start the server).

diff --git a/hsds/hsds_app.py b/hsds/hsds_app.py
index 9e4310a..d236bcd 100644
--- a/hsds/hsds_app.py
+++ b/hsds/hsds_app.py
@@ -262,8 +262,7 @@ class HsdsApp:
             common_args.append(f"--log_level={self._loglevel}")

         py_exe = sys.executable
-        cmd_path = os.path.join(sys.exec_prefix, "bin")
-        cmd_path = os.path.join(cmd_path, "hsds-node")
+        cmd_path = os.path.join(self._cmd_dir, "hsds-node")
         if not os.path.isfile(cmd_path):
             # search corresponding location for windows installs
             cmd_path = os.path.join(sys.exec_prefix, "Scripts")

jreadey commented 1 year ago

Glad you were able to get it working.

I've always been a little leery of changing the system python packages (e.g. what if a different python app requires different package versions?), but I'm fine with making this change. I'd reverse the search order though - look in exec_prefix, then cmd_dir. If I don't have an active conda environment, it won't find anything in exec_prefix. If I am running conda, I'd expect to use the version in the conda path.
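
The search order described above can be sketched as follows. This is an illustration under stated assumptions, not the code that was eventually merged: `find_hsds_node` is a hypothetical helper, and the candidate paths are taken from the snippets quoted earlier in this thread.

```python
import os
import sys


def find_hsds_node(cmd_dir, exec_prefix=sys.exec_prefix):
    """Return the first existing hsds-node path, or raise FileNotFoundError.

    Hypothetical helper: the order (exec_prefix first, then cmd_dir) follows
    the comment above, not necessarily what HSDS itself does.
    """
    candidates = [
        # Active conda environment on POSIX: <exec_prefix>/bin/hsds-node
        os.path.join(exec_prefix, "bin", "hsds-node"),
        # Directory where the hsds command itself was found
        os.path.join(cmd_dir, "hsds-node"),
        # Corresponding location for Windows installs
        os.path.join(exec_prefix, "Scripts", "hsds-node-script.py"),
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    raise FileNotFoundError("can't find hsds-node executable")
```

With this order, a conda install wins whenever one exists under exec_prefix, and a system-wide install such as /usr/local/bin is only used as a fallback.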

assaron commented 1 year ago

"If I am running conda, I'd expect to use the version in the conda path."

I thought that was the point of cmd_dir: it tries to find the appropriate location of the hsds installation, be it in conda or native, no?

assaron commented 1 year ago

I've tried to install hsds with my change inside a conda environment and it seems to work just fine:

$ hsds --root_dir /home/alserg/workspace/hsds/data --password_file admin/config/passwd.default --logfile hs.log --socket_dir /tmp/hs --loglevel INFO --config_dir=admin/config --count=1                  
set logging to: 20
logfile: /home/alserg/workspace/hsds/hs.log
INFO:root:using cmd_dir: /home/alserg/miniconda3/envs/hsds/bin
INFO:root:all processes ready!
INFO:root:Ready after: 0.00 s

jreadey commented 1 year ago

@assaron - I've updated the unbloundlocal branch with a fix for this issue and some other updates for running without docker. Can you give it a try?

assaron commented 1 year ago

@jreadey Thanks! It seems to work well! My issue is resolved, then.

jreadey commented 1 year ago

Great! I've merged the changes into master.

Will-Gorman commented 9 months ago

I am seeing the same error as above, but I think the cause is more basic: the folder containing the executable may not be in my PATH. However, I am not sure what I need to add to PATH on my Windows machine.

When I try to start the HSDS server with runall.bat I get:

DEBUG:root:looking for hsds-servicenode in PATH env var folders
INFO:root:using cmd_dir: C:\Users\wgorman\Anaconda3\envs\hsds\Scripts
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\wgorman\Anaconda3\envs\hsds\Scripts\hsds.exe\__main__.py", line 7, in <module>
  File "C:\Users\wgorman\Anaconda3\envs\hsds\Lib\site-packages\hsds\app.py", line 356, in main
    app.run()
  File "C:\Users\wgorman\Anaconda3\envs\hsds\Lib\site-packages\hsds\hsds_app.py", line 268, in run
    raise FileNotFoundError("can't find hsds-node executable")
FileNotFoundError: can't find hsds-node executable

And when I run hsds-node, I get an error: "no node_type argument found"

What location do I need to add to PATH?
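
One way to narrow this down is to list which hsds entry points actually exist in each PATH folder. The following is a diagnostic sketch only; `list_hsds_scripts` is a hypothetical helper, and it does not say which file name HSDS expects on Windows.

```python
import os


def list_hsds_scripts(path_env=None):
    """Map each PATH folder to the hsds* files it contains (illustrative helper)."""
    if path_env is None:
        path_env = os.environ.get("PATH", "")
    found = {}
    for folder in path_env.split(os.pathsep):
        if not folder or not os.path.isdir(folder):
            continue
        try:
            entries = os.listdir(folder)
        except OSError:
            # Skip folders that cannot be read.
            continue
        names = sorted(n for n in entries if n.lower().startswith("hsds"))
        if names:
            found[folder] = names
    return found


if __name__ == "__main__":
    for folder, names in list_hsds_scripts().items():
        print(folder, names)
```

Comparing the listed file names with the path the traceback is checking may show whether the problem is the PATH or the file name being searched for.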