HDFGroup / h5pyd

h5py distributed - Python client library for HDF Rest API

Add test workflow to GitHub #172

Closed by mattjala 3 months ago

mattjala commented 3 months ago

This executes the h5pyd tests on each PR to master. Logs from workflow execution on my repo are here.

grantbuster commented 3 months ago

I can't get this test to run successfully on my end, but I can't get anything HSDS/h5pyd related to run right now, and I think we desperately need this.

@mattjala thoughts on using pip to install HSDS? If the default guidance is to install from a cloned repo, that's fine, but the HSDS README says you can install from pip. Whatever is most stable would be my preference...

Could we also have the HSDS repo run this test?

I would really appreciate a simple h5pyd test that does this:


    import h5pyd

    with h5pyd.Folder('/nrel/') as d:
        print(list(d))
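
For reference, a minimal pytest-style version of that check might look like the sketch below. The test name and the assertion are illustrative, not from this thread, and it assumes HS_ENDPOINT, HS_USERNAME, and HS_PASSWORD are already configured for the environment running the tests:

    import h5pyd

    def test_nrel_folder_listing():
        # open the /nrel/ folder on the configured HSDS endpoint
        with h5pyd.Folder('/nrel/') as d:
            names = list(d)
        # a healthy server and bucket should list at least one domain
        assert len(names) > 0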
jreadey commented 3 months ago

@grantbuster - there have been some updates over the last year or so that should make it fairly easy to run h5pyd + HSDS, whether as a GitHub Action, in your notebook, or using Docker/Kubernetes. Here's probably the easiest path to what you're looking for:

  1. $ pip install hsds
  2. $ pip install h5pyd
  3. $ export AWS_S3_GATEWAY=http://s3.us-west-2.amazonaws.com
  4. $ export BUCKET_NAME=nrel-pds-hsds
  5. $ hsds &
  6. $ export HS_ENDPOINT=http://localhost:5101
  7. $ export HS_USERNAME=$USER
  8. $ export HS_PASSWORD=$USER
  9. $ hsls /nrel/
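
Once HSDS is running, the same check can also be done from Python instead of the hsls CLI. A rough sketch, assuming the local endpoint from the steps above (the credential values are placeholders standing in for the $USER exports; h5pyd accepts endpoint, username, and password as keyword arguments, so the exports are optional in this case):

    import h5pyd

    # endpoint and credentials passed explicitly rather than via the
    # HS_* environment variables; values mirror steps 6-8 above
    with h5pyd.Folder('/nrel/',
                      endpoint='http://localhost:5101',
                      username='myuser',
                      password='mypass') as folder:
        for name in folder:
            print(name)  # same listing as `hsls /nrel/`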

Note that you don't need to clone any GitHub repos or run any build scripts. Instead of the exports after step 5, you can also just run hsconfigure and answer the questions for endpoint, username, and password. On Windows, you'll need the equivalent set commands.

What doesn't work currently (and I don't think ever worked) is having HSDS read from an S3 bucket and write to a POSIX directory. I have some ideas for fixing that which I hope to get into the next release.