Closed matthewcarbone closed 1 year ago
Remove this line:
apt -y install dune
then retry.
dune is supposed to be installed automatically by opam as a dependency of fasmifra.
You should use pip3 instead of pip to install rdkit, to make sure the Python 3 tooling is being used. Then start a python3 interpreter and check that rdkit is installed properly (import rdkit).
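For reference, the rdkit check suggested above can also be scripted. This is only a minimal sketch using the standard library; the helper name `module_available` is made up for illustration and is not part of FASMIFRA or rdkit. Run it with the same python3 that pip3 installed into.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the running interpreter can locate the top-level module `name`."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Printing the interpreter path shows which Python is actually being used.
    print("interpreter:", sys.executable)
    print("rdkit importable:", module_available("rdkit"))
```

If the second line prints False, rdkit was most likely installed for a different interpreter than the one running the script.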
Ok sounds good, let me give this a try. You're right I did not use Dune explicitly during the installation.
@UnixJunkie I retried using your instructions, but unfortunately I am running into the same issue. There are still lots of fragments such as these
# gen_1k.smi
...
C(Nc0ccc(-n1nc(C2CC2)cc1[*:1][*:11]C1CC1)cc0)(=O)c0ccncc0 genmol_4
...
in the output file.
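To get a count of affected molecules instead of eyeballing the file, something like the following could be used. It is a rough sketch, assuming the output is a whitespace-separated "SMILES name" file and that any remaining `[*:N]` token marks a fragment that was not joined properly; the function name `unresolved_lines` is hypothetical.

```python
import re

# Matches atom-mapped wildcard atoms such as [*:1] or [*:11].
ATTACHMENT_POINT = re.compile(r"\[\*:\d+\]")


def unresolved_lines(lines):
    """Return (smiles, name) pairs whose SMILES still contain a [*:N] token."""
    hits = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines such as "# gen_1k.smi".
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        smiles = fields[0]
        name = fields[1] if len(fields) > 1 else ""
        if ATTACHMENT_POINT.search(smiles):
            hits.append((smiles, name))
    return hits


# Two lines taken from this thread: one broken, one clean.
sample = [
    "C(Nc0ccc(-n1nc(C2CC2)cc1[*:1][*:11]C1CC1)cc0)(=O)c0ccncc0 genmol_4",
    "N(c0ccc(CC)cc0)c0ccc(F)cc0 genmol_11",
]
bad = unresolved_lines(sample)
print(len(bad), "line(s) with leftover attachment points")
```

For the real file, pass an open file handle instead of the sample list, e.g. `with open("gen_1k.smi") as fh: bad = unresolved_lines(fh)`.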
It might be prudent at this stage for you to check what I've done here. I'm betting others have run into this/similar issues. Could you try to reproduce the steps here on your own machine and see what you get?
First, spin up the container:
docker run -ti --rm -v ~/Data/Docker_Share:/data myubuntu /bin/bash
(dockerfile)
FROM ubuntu:22.04
# Disable Prompt During Packages Installation
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y \
vim \
&& rm -rf /var/lib/apt/lists/*
Then proceed with the installation:
apt update
# apt -y install dune # not doing this!
apt -y install git
git clone https://github.com/UnixJunkie/FASMIFRA.git
cd FASMIFRA
apt -y install opam # Used all defaults
opam init # disabled sandboxing, ok since in container
eval `opam config env`
apt -y install python3-pip # <- using pip3
pip3 install rdkit # <- using pip3
opam install --fake conf-rdkit
opam install fasmifra
Note that rdkit works fine:
>>> from rdkit import Chem
>>> Chem.MolFromSmiles("CCC")
<rdkit.Chem.rdchem.Mol object at 0xffffb1efb3e0>
I then executed the same script as before:
xzcat data/CHEMBL_100k.smi.xz | head -1000 > chembl_1k.smi
./bin/fasmifra_fragment.py -i chembl_1k.smi -o chembl_1k_frags.smi
fasmifra -f -n 1000 -i chembl_1k_frags.smi -o gen_1k.smi
There is now an install.sh script; I also updated the README. Regards, F.
@UnixJunkie following the instructions in the new install.sh script has worked!
I'm not really sure what is different between these instructions and what I did before; maybe the order somehow? Anyway, just for the record, so you know exactly what I did (and so I know exactly what to do) 😁:
Dockerfile:
# myubuntu
FROM ubuntu:22.04
# Disable Prompt During Packages Installation
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y \
vim \
&& rm -rf /var/lib/apt/lists/*
Initial commands:
docker run -ti myubuntu /bin/bash
apt update
apt -y install git
git clone https://github.com/UnixJunkie/FASMIFRA.git
cd FASMIFRA
Next set of commands to install everything. Note I had to install python3-pip and run without sudo since I'm in a container:
apt install -y opam
opam init -y
apt install python3-pip # Needed to do this first
pip3 install rdkit
eval `opam config env`
opam install --fake conf-rdkit
opam install -y fasmifra
which fasmifra_fragment.py
# /root/.opam/default/bin/fasmifra_fragment.py
which fasmifra
# /root/.opam/default/bin/fasmifra
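The same PATH lookup can be done from Python, which is handy when the check needs to live inside a script. This is just a sketch built on the standard library's `shutil.which`; the helper name `locate_tools` is made up for illustration.

```python
import shutil


def locate_tools(names):
    """Map each executable name to its full path on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # The two executables the opam package is expected to provide.
    for name, path in locate_tools(["fasmifra", "fasmifra_fragment.py"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

Both tools resolving to the same opam switch is a quick hint that the fragmenter and the generator come from the same install.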
Running with explicit paths just to be totally sure...
xzcat data/CHEMBL_100k.smi.xz | head -1000 > chembl_1k.smi
/root/.opam/default/bin/fasmifra_fragment.py -i chembl_1k.smi -o chembl_1k_frags.smi
/root/.opam/default/bin/fasmifra -f -n 1000 -i chembl_1k_frags.smi -o gen_1k.smi
And success!
# gen_1k.smi
c0(-c1ccc(Cl)cc1)sc1n(c(COC(=O)C)nn1)n0 genmol_1
COc0c(F)cc(N/N=C(/Cn1c2c(nn1)cccc2)c1ccccc1)cc0F genmol_2
c01c(cc(C(=O)C2CCCN(Cc3cc(F)c(F)cc3)C2)cc0)OCO1 genmol_3
c0(C1(O)OC(=O)C(c2ccc(NC(=O)NCCCC)cc2)=C1Cc1ccccc1)ccc(OC)cc0 genmol_4
C(OC0C(OC(=O)C)C(OC(=O)N2[C@@H]3c4c(c(OC)c(C)c(OCCCC(=O)O)c4OC)C[C@H]2C(=O)N2[C@@H](CN4C(=O)c5c(cccc5)C4=O)c4c(OC)c(OC)c(C)c(OC)c4C=C32)COC0n0c(=S)c(C#N)c(-c1ccccc1)cc0-c0ccc(Cl)cc0)(=O)C genmol_5
C(COc1ccc(Nc2ncnc3cnc(-c4n(C)cnc4)cc32)cc1)N0CCC(NS(=O)(=O)c5c(I)cccc5)CC0 genmol_6
c01c(COC)c(C(O)=O)oc0cccc1 genmol_7
C(C(=O)Nc0ccccc0N0CCN(C(C(C)C)=O)CC0)Oc1ccc2c(c1)C(=O)C(=O)N2 genmol_8
Clc0c(C(=O)NC(Nc1ccc(Cl)cc1)=S)cccc0 genmol_9
C0C1CC2CC0C(OC(C)C)C(C2)C1 genmol_10
N(c0ccc(CC)cc0)c0ccc(F)cc0 genmol_11
Fc0ccc(C(CCCNCCc2ccccc2)c1ccc(F)cc1)cc0 genmol_12
Clc0c(C(=O)NC1CCC(F)(F)C1)cccc0 genmol_13
N0(C)CCc1n(CC)nc(C(=O)NCC(C(O)=O)N)c1C0 genmol_14
CCC(CC)(CN)NC(C)c0ccccc0 genmol_15
c0c(C(c1ccccc1)CN2CCN(c3ccccc3)CC2)cccc0 genmol_16
Clc0ccc(OP(=O)(Oc1ccc(Cl)cc1)[C@@H](C(C)C)NC(=O)[C@H](c1ccc(F)cc1)OC(=O)[C@H](CCSC)Nc2cc(Cl)cc(Cl)c2)cc0 genmol_17
c0c(N/N=C(\C(=O)C)C(c1ccccc1)=O)ccc(OC)c0 genmol_18
CC(=C0C[C@H]1[C@H](C(=C)C)CC[C@]1(C)OC0=O)C genmol_19
Clc0ccc(O)c(/C=N/NS(=O)(=O)c1ccc(Br)cc1)c0 genmol_20
I'm quite happy with this and will almost certainly be using it in some future work. Thank you!
In the container, everything is installed as root?!
Good that it worked for you.
@UnixJunkie I'm not super familiar with Docker just yet, but I believe so!
Referencing #14 for completeness.
@UnixJunkie I have tried yet another approach to get this working. I'm happy to chat over Zoom but I really believe something is broken with the installation process at this point.
Steps I have taken
I have spun up a fresh Docker environment on my computer with a very simple dockerfile.
On this fresh environment, I performed the installation steps and then executed the steps as laid out in the test.sh script. Note that nothing is installed to _build during the make process, so I am using what I believe to be the correct executable in the working directory. This leads to similar fragments as we discussed in #14.
Is it possible that there's some difference between your most up-to-date code here and the executable you have on your computer?