BiBiServ / bibigrid

BiBiGrid is a tool for an easy cluster setup inside a cloud environment.
Apache License 2.0
aws azure cloud-environment cluster openstack

BiBiGrid

BiBiGrid is a cloud cluster creation and management framework for OpenStack (and more providers in the future).

BiBiGrid uses Ansible to configure standard Ubuntu 20.04/22.04 LTS as well as Debian 11 cloud images. Depending on your configuration, BiBiGrid can set up an HPC cluster for grid computing (Slurm Workload Manager), a shared filesystem (on local discs and attached volumes), a cloud IDE for writing, running and debugging (Theia Web IDE) and many more.

Note: The latest version is currently a work in progress, and future changes are likely. Not all features of the previous version are available yet, but they will come soon. The previous version is still available, but it is no longer maintained.

Getting Started

For most users the Hands-On BiBiGrid Tutorial is the best entry point.

However, if you are already quite experienced with OpenStack and the previous BiBiGrid, the following brief explanation might be just what you need.

Brief, technical BiBiGrid overview

### How to configure a cluster?

#### Configuration File: bibigrid.yml

A template configuration file, [bibigrid.yml](bibigrid.yml), is included in the repository. The cluster configuration file consists of a list of configurations. Each configuration holds the provider-specific settings. The first configuration additionally contains all the keys that apply to the entire cluster (roles, for example). Currently only clusters with one provider are possible, so focus only on the first configuration in the list.

The configuration template [bibigrid.yml](bibigrid.yml) contains many helpful comments that make it easier to complete.

[You need more details?](documentation/markdown/features/configuration.md)

#### Cloud Specification Data: clouds.yml

To access the cloud, authentication information is required. You can download your `clouds.yaml` from OpenStack. Place your `clouds.yaml` in `~/.config/bibigrid/`; BiBiGrid loads it from there on execution.

[You need more details?](documentation/markdown/features/cloud_specification_data.md)

### Quick First Time Usage

If you haven't used BiBiGrid1 in the past or are unfamiliar with OpenStack, we heavily recommend following the [tutorial](https://github.com/deNBI/bibigrid_clum2022) instead.

#### Preparation

1. Download (or create) the `clouds.yaml` (and optionally `clouds-public.yaml`) file as described [above](#cloud-specification-data-cloudsyml).
2. Place the `clouds.yaml` into `~/.config/bibigrid`.
3. Fill the configuration, `bibigrid.yml`, with your specifics. At a minimum you need: a master instance with a valid type and image, a region, an availability zone, an sshUser (most likely ubuntu) and a subnet. You probably also want at least one worker with a valid type, image and count.
4. If your cloud provider runs post-launch services, set the `waitForServices` key accordingly. It expects a list of services to wait for.
5. Create a virtual environment from `bibigrid/requirements.txt`. See [here](https://www.akamai.com/blog/developers/how-building-virtual-python-environment) for more detailed info.
6. Take a look at [First execution](#first-execution).

#### First execution

First, complete the steps described in [Preparation](#preparation).

After cloning the repository, navigate to `bibigrid`. To execute BiBiGrid, source the virtual environment created during [preparation](#preparation). Take a look at BiBiGrid's [Command Line Interface](documentation/markdown/features/CLI.md) if you want to explore for yourself.

A first execution run-through could be:

1. `./bibigrid.sh -i [path-to-bibigrid.yml] -ch`: checks the configuration.
2. `./bibigrid.sh -i [path-to-bibigrid.yml] -c`: creates the cluster (execute only if the check was successful).
3. Use **BiBiGrid's create output** to investigate the created cluster further. Connecting to the IDE might be especially helpful. Otherwise, connect using ssh.
4. While connected via ssh, try `sinfo` to print node info.
5. Run `srun -x $(hostname) hostname` to power up a worker and get its hostname.
6. Run `sinfo` again to see the node powering up. After a while it will be terminated again.
7. Use the terminate command from **BiBiGrid's create output** to shut down the cluster again. All floating IPs used will be released.

Great! You've just started and terminated your first cluster using BiBiGrid!
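The minimum requirements from the Preparation steps can be checked with a few lines of Python before running BiBiGrid's own check. This is only a sketch: the key names `masterInstance`, `workerInstances` and `availabilityZone` are assumptions inferred from the prose, and the values are placeholders, so compare them with the comments in the `bibigrid.yml` template. The function `missing_keys` is hypothetical and not part of BiBiGrid.

```python
from typing import Any

# Key names are assumptions derived from the Preparation steps;
# the bibigrid.yml template is the authoritative source for their spelling.
REQUIRED_TOP_LEVEL = ("region", "availabilityZone", "sshUser", "subnet")
REQUIRED_INSTANCE = ("type", "image")


def missing_keys(configuration: dict[str, Any]) -> list[str]:
    """Return the names of required entries that are absent or empty."""
    missing = [key for key in REQUIRED_TOP_LEVEL if not configuration.get(key)]
    master = configuration.get("masterInstance") or {}
    missing += [f"masterInstance.{key}" for key in REQUIRED_INSTANCE if not master.get(key)]
    # Workers are optional, but each listed worker needs type, image and count.
    for index, worker in enumerate(configuration.get("workerInstances") or []):
        for key in REQUIRED_INSTANCE + ("count",):
            if not worker.get(key):
                missing.append(f"workerInstances[{index}].{key}")
    return missing


# The configuration file is a list; with one provider only the first entry matters.
example = [{
    "region": "example-region",
    "availabilityZone": "example-zone",
    "sshUser": "ubuntu",
    "masterInstance": {"type": "example.flavor", "image": "example-image"},
    "workerInstances": [{"type": "example.flavor", "image": "example-image"}],
}]

print(missing_keys(example[0]))  # ['subnet', 'workerInstances[0].count']
```

A check like this only catches missing entries. Whether a type, image or subnet actually exists in your project is something only BiBiGrid's `-ch` run can tell you.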

Troubleshooting

If your cluster doesn't start up, first make sure your configuration file is valid (`-ch`). If it is not, try to modify the configuration file to make it valid. Use `-v` or `-vv` to get more verbose output, so you can find the issue faster. Also double-check that you have sufficient permissions to access the project. If you can't make your configuration file valid, please contact a developer. If the configuration file is valid but the cluster still doesn't start, please contact a developer and/or manually check whether your quotas are exceeded. Some quotas currently cannot be checked by BiBiGrid.
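When repeating checks with different verbosity levels, it can help to build the command line in one place. The sketch below uses only the flags named in this README (`-i`, `-ch`, `-c`, `-v`, `-vv`). The helper `bibigrid_command` is hypothetical, and it is an assumption that the verbosity flags can be appended after the action flag, so consult the Command Line Interface documentation before relying on it.

```python
def bibigrid_command(config_path: str, action: str = "check", verbosity: int = 0) -> list[str]:
    """Build an argument list for ./bibigrid.sh from flags named in this README."""
    action_flags = {"check": "-ch", "create": "-c"}
    if action not in action_flags:
        raise ValueError(f"unknown action: {action!r}")
    if verbosity not in (0, 1, 2):
        raise ValueError("verbosity must be 0, 1 (-v) or 2 (-vv)")
    command = ["./bibigrid.sh", "-i", config_path, action_flags[action]]
    if verbosity:
        # Assumption: verbosity flags may follow the action flag.
        command.append("-" + "v" * verbosity)
    return command


print(" ".join(bibigrid_command("bibigrid.yml", "check", verbosity=2)))
# ./bibigrid.sh -i bibigrid.yml -ch -vv
```

The returned list can be passed to `subprocess.run(command, check=True)` from inside the `bibigrid` directory with the virtual environment active.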

Whenever you contact a developer, please send your logfile along.

Documentation

If you would like to learn more about BiBiGrid please follow a fitting link:

Differences to old Java BiBiGrid

* BiBiGrid no longer uses RC files but `clouds.yaml` files for cloud-specification data. Environment variables are no longer used (or supported). See [Cloud Specification Data](documentation/markdown/features/cloud_specification_data.md).
* BiBiGrid has a largely reworked configuration file. This step was necessary because the BiBiGrid core now supports multiple providers. See [Configuration](documentation/markdown/features/configuration.md).
* BiBiGrid currently only implements the provider OpenStack.
* BiBiGrid only starts the master and dynamically starts workers using Slurm when they are needed. Workers are powered down once they have been unused for a longer period.
* BiBiGrid lays the foundation for clusters that are spread over multiple providers, but hybrid clouds aren't fully implemented yet.

Development

Development-Guidelines

https://github.com/BiBiServ/Development-Guidelines

On implementing concrete providers

New concrete providers are straightforward to implement. Copy the provider.py file, inherit from the provider class, and implement all of its methods for your cloud provider. Then add your provider to the providerHandler lists, giving it an associated name for the configuration files. Your provider is then automatically included in BiBiGrid's tests and regular execution. Running the tests first shows whether all provider methods are implemented as expected.
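The pattern described here (an abstract provider class, a concrete subclass, and a name-to-class registry) can be sketched as follows. Every name in this block is hypothetical: `Provider`, its two methods, `ExampleCloudProvider` and `PROVIDER_NAMES` only stand in for what provider.py and providerHandler actually define, so treat it as an illustration of the approach and not as BiBiGrid's real interface.

```python
from abc import ABC, abstractmethod


class Provider(ABC):
    """Hypothetical stand-in for the abstract class in provider.py."""

    @abstractmethod
    def create_server(self, name: str, flavor: str, image: str) -> dict:
        """Start a server and return a description of it."""

    @abstractmethod
    def delete_server(self, name: str) -> bool:
        """Delete a server; return True if it existed."""


class ExampleCloudProvider(Provider):
    """Toy provider that keeps its 'servers' in memory."""

    def __init__(self) -> None:
        self._servers: dict[str, dict] = {}

    def create_server(self, name: str, flavor: str, image: str) -> dict:
        server = {"name": name, "flavor": flavor, "image": image}
        self._servers[name] = server
        return server

    def delete_server(self, name: str) -> bool:
        return self._servers.pop(name, None) is not None


# Hypothetical registry: maps the name used in configuration files to a class.
PROVIDER_NAMES = {"examplecloud": ExampleCloudProvider}


def get_provider(name: str) -> Provider:
    """Instantiate the provider registered under the given name."""
    return PROVIDER_NAMES[name]()


class IncompleteProvider(Provider):
    """Forgets delete_server, so it cannot be instantiated."""

    def create_server(self, name: str, flavor: str, image: str) -> dict:
        return {}


try:
    IncompleteProvider()
except TypeError as error:
    print(f"Incomplete provider rejected: {error}")
```

With an abstract base class, a provider that misses a method fails as soon as it is instantiated, which is one way a test run can reveal unimplemented methods early.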