mlederbauer / NMRcraft

Crafting Catalysts from NMR Features, Ligand by Ligand
MIT License



NMRcraft

Crafting Catalysts from NMR Features

NMRcraft is a project that predicts ligands of complexes from their chemical shift tensors.

🐳 Installation

See installation instructions

## Docker Desktop 🐳

First you need to install [Docker](https://www.docker.com/products/docker-desktop/).

### Download Docker Image

You can download the image by searching for 'tiaguinho/nmrcraft_arch' in the search bar at the top and clicking on pull.

### Running the Image

To run the image, go to the 'Images' tab and click the "play" button on the nmrcraft_arch image you pulled. It should then appear as running in the 'Containers' tab. There, click on the ⋮ symbol and then on '>_ Open in terminal'. A terminal window should pop up, in which you type the command `zsh`.

## Console 🐧

### Download Docker Image

To use the Docker image, make sure that [Docker](https://www.docker.com/products/docker-desktop/) is installed and pull the image from [Docker Hub](https://hub.docker.com/r/tiaguinho/nmrcraft_arch) with this command:

```bash
docker pull tiaguinho/nmrcraft_arch
```

(If running on Windows, you might need to call docker.exe instead of just docker.)

### Running the Image

```bash
docker run -it tiaguinho/nmrcraft_arch
```

## Visual Studio Code 🪟

To download the image, follow the same steps as for either the console or Docker Desktop.

### Running the Docker Image
Using Docker in VS Code
  1. Open VS Code and install the extensions for Docker and Dev Containers.
  2. Go to the newly added Docker tab. You should now see three sections: Containers, Images and Registries. Under Images, the tiaguinho/nmrcraft_arch image should be visible.
  3. To keep the container from being deleted every time you stop it, the `--rm` flag has to be removed. Open the settings (Ctrl + , or Cmd + , on Mac) and type `docker run`. Select 'Edit in settings.json' for the 'Run Interactive' command and remove the `--rm` to get: `"docker.commands.runInteractive": "${containerCommand} run -it ${exposedPorts} ${tag}", "docker.commands.run": "${containerCommand} run -d ${exposedPorts} ${tag}"`. Save the file.
  4. In the Docker tab, right-click on the image and select run interactive. A container should now appear in the Containers section. Right-click on it and select stop.
  5. Right-click again on the container and select start to start it back up.
  6. Right-click again on the container and select attach Visual Studio Code. A new VS Code window should appear; this window now runs fully inside the container. If necessary, switch to `/home/steve/NMRcraft`.
  7. Pull the latest changes to the repository with `git pull origin main`.
  8. Have fun developing.
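
The settings change from step 3 can also be scripted. The following is a minimal sketch and not part of NMRcraft: `strip_rm_flag` and `patch_settings` are hypothetical helper names, and `ASSUMED_DEFAULT` is an assumption about the template the VS Code Docker extension ships with. The sketch works on a settings dictionary that has already been loaded; locating and parsing `settings.json` (which may contain comments) is left out.

```python
import json

# Assumed default template of the VS Code Docker extension (not taken from the NMRcraft docs).
ASSUMED_DEFAULT = "${containerCommand} run --rm -it ${exposedPorts} ${tag}"


def strip_rm_flag(template: str) -> str:
    """Return the command template without the --rm flag."""
    return " ".join(part for part in template.split() if part != "--rm")


def patch_settings(settings: dict) -> dict:
    """Return a copy of the settings with the two run commands from step 3 set."""
    patched = dict(settings)
    current = patched.get("docker.commands.runInteractive", ASSUMED_DEFAULT)
    patched["docker.commands.runInteractive"] = strip_rm_flag(current)
    patched["docker.commands.run"] = "${containerCommand} run -d ${exposedPorts} ${tag}"
    return patched


if __name__ == "__main__":
    # Print the resulting fragment so it can be compared with settings.json by hand.
    print(json.dumps(patch_settings({}), indent=2))
```

Printing the fragment instead of writing the file keeps the sketch side-effect free; the two keys can then be pasted into `settings.json` manually.
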

## Getting Access to the Dataset 💾

For the scripts to be able to access the dataset, you must log in to Hugging Face with the following commands:

```bash
pip install -U "huggingface_hub[cli]"  # if not installed already
huggingface-cli login  # log in after generating an authentication token for Hugging Face
```

We include the link to be authenticated in the report appendix. If you run into issues accessing the dataset, contact [mlederbauer@ethz.ch](mailto:mlederbauer@ethz.ch).
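
Before starting a long run, it can help to check that a login token is present at all. The sketch below is not part of NMRcraft; `find_hf_token_source` is a hypothetical helper. It assumes the usual `huggingface_hub` conventions: a token in the `HF_TOKEN` environment variable, or a `token` file under `~/.cache/huggingface` (or under `HF_HOME` if that is set). It only checks that a token exists, not that the token is valid or has access to the private dataset.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_hf_token_source(env: Mapping[str, str], home: Path) -> Optional[str]:
    """Return a short description of where a Hugging Face token was found, or None."""
    if env.get("HF_TOKEN"):
        return "HF_TOKEN environment variable"
    # Assumption: default cache layout; HF_HOME overrides the base directory.
    hf_home = Path(env.get("HF_HOME", str(home / ".cache" / "huggingface")))
    token_file = hf_home / "token"
    if token_file.is_file() and token_file.read_text().strip():
        return f"token file at {token_file}"
    return None


if __name__ == "__main__":
    source = find_hf_token_source(os.environ, Path.home())
    if source is None:
        print("No Hugging Face token found; run `huggingface-cli login` first.")
    else:
        print(f"Found Hugging Face token via {source}.")
```
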

🔥 Usage

To reproduce all results shown in the report, run the following commands:

poetry shell
python scripts/reproduce_results.py

This script will interatively

When the parameter max_eval is set to a high value such as 50, expect the whole process to take about two hours. For quick testing, max_eval can instead be set to a low value such as 2, at the cost of worse model performance. To run individual pipelines, use scripts/training/{one_target,multi_targets}.sh, although running scripts/reproduce_results.py is recommended. Results are also accessible via the polybox here.
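
The trade-off between a small and a large max_eval can be illustrated with a toy random search. This is a sketch under a stated assumption, namely that max_eval is the number of hyperparameter configurations evaluated during tuning. The objective and the search are invented for illustration and are not NMRcraft's tuning code.

```python
import random


def toy_objective(x: float) -> float:
    """Invented loss with its minimum at x = 0.3; stands in for a validation loss."""
    return (x - 0.3) ** 2


def random_search(max_eval: int, seed: int = 0) -> float:
    """Evaluate max_eval random candidates in [0, 1) and return the lowest loss seen."""
    rng = random.Random(seed)
    return min(toy_objective(rng.random()) for _ in range(max_eval))


if __name__ == "__main__":
    for budget in (2, 50):
        print(f"max_eval={budget}: best toy loss = {random_search(budget):.5f}")
```

With a fixed seed, the larger budget can never do worse here, because its candidates include all candidates of the smaller budget. The price is a proportionally larger number of objective evaluations.
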

🖼️ Poster

If you were not able to visit our beautiful poster at ETH Zurich on May 30th 2024, you can access our poster here!


🧑‍💻 Developing

See developer instructions

### Activate the Poetry venv

To use the packages installed via Poetry, execute the following command:

```bash
poetry shell
```

This puts you into the Poetry shell, from which you have direct access to all packages managed by Poetry.

### GitHub pushing auth

For authentication, the Docker image comes with the GitHub CLI. To log in, execute this command:

```bash
gh auth login
```

and follow the interactive instructions using Enter and the arrow keys. Once logged in, you should be able to push changes to the repo.

### Adding packages and libraries to the project

If you added a new feature that requires a new package or library, add it by running `poetry add <package-name>`, then run `make install` to install the new dependencies. (You might need to run `poetry lock` to update the `poetry.lock` file if you added a dependency manually in the `pyproject.toml` file.)

### Loading the Data

The dataset is stored in a private repository on Hugging Face. To download the dataset from the Hub in Python, you need to log in to your Hugging Face account:

```bash
huggingface-cli login
```

Citation

@software{nmrcraft2024,
  author       = {Magdalena Lederbauer and Karolina Biniek and Tiago Würthner and Samuel Stricker and Yingnan Wang},
  title        = {{mlederbauer/NMRcraft: Crafting Catalysts from NMR Features}},
  month        = may,
  year         = 2024
}

Repository initiated with fpgmaas/cookiecutter-poetry.