DrrDom / crem

CReM: chemically reasonable mutations framework
BSD 3-Clause "New" or "Revised" License

fragmentation problem #17

Open suice07 opened 1 year ago

suice07 commented 1 year ago

Hi,

I am trying to build my own DB following the instructions. When I run mols = list(mutate_mol(m, db_name='test.db', max_size=3)) from the tutorial with my own DB, the resulting list of mols is always empty, whereas the prebuilt DB replacements02_sa2.db works without problems. I used the CHEMBL231.smi file from the example folder and followed the instructions:

fragmentation -i CHEMBL231.smi -o frags.txt -c 32 -v
frag_to_env -i frags.txt -o r3.txt -r 3 -c 32 -v
sort r3.txt | uniq -c > r3_c.txt
env_to_db -i r3_c.txt -o tert.db -r 3 -c -v

and got this result:

 root@872fabd400c3:/home/crem# python test.py
[]

Is there something I missed?

DrrDom commented 1 year ago

Hi, the described workflow looks OK, except for the DB name (tert.db instead of test.db). If everything else is correct, the issue may lie in the structure of your molecule m: it may be that none of its radius-3 contexts are present in test.db (quite possible for such a small DB). Try generating the DB with radius 2 or 1. Does that help?

suice07 commented 1 year ago

Hi,

I generated the DB using radius 2 or 1, but when generation has finished and I execute mols = list(mutate_mol(m, db_name='test.db', max_size=3)), I get an error:

 File "test.py", line 13, in <module>
    mols = list(mutate_mol(m, db_name='zink.db', max_size=2))
  File "/home/crem/crem/crem.py", line 487, in mutate_mol
    for frag_sma, core_sma, freq, ids in __gen_replacements(mol1=mol, mol2=None, db_name=db_name, radius=radius,
  File "/home/crem/crem/crem.py", line 344, in __gen_replacements
    row_ids = __get_replacements_rowids(cur, env, dist, min_atoms, max_atoms, radius, min_freq, **kwargs)
  File "/home/crem/crem/crem.py", line 286, in __get_replacements_rowids
    db_cur.execute(sql)
sqlite3.OperationalError: no such table: radius3

I don't get it: why is it still searching for the radius3 table? Is there some cache that I missed?

I generated the DB as follows:

fragmentation -i CHEMBL231.smi -o frags.txt -c 32 -v
frag_to_env -i frags.txt -o r2.txt -r 2 -c 32 -v
sort r2.txt | uniq -c > r2_c.txt
env_to_db -i r2_c.txt -o test.db -r 2 -c -v

DrrDom commented 1 year ago

You have to pass radius=2 to the mutate_mol function. Just a tip: you can store tables for different radii in the same DB.
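
To see which radius values a given DB can actually serve before calling mutate_mol, one option is to list its tables directly. This is only a rough sketch using Python's standard sqlite3 module: the radiusN table naming is inferred from the "no such table: radius3" error above, and available_radii is a hypothetical helper, not part of CReM.

```python
import sqlite3


def available_radii(db_name):
    """Return the sorted radii N for which a table named 'radiusN' exists.

    Assumption: tables are named 'radius1', 'radius2', ... as suggested by
    the 'no such table: radius3' error. This helper is hypothetical and is
    not part of the CReM API.
    """
    # Note: sqlite3.connect() creates an empty file if db_name does not
    # exist, so a misspelled name yields an empty list rather than an error.
    con = sqlite3.connect(db_name)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name LIKE 'radius%'"
        ).fetchall()
    finally:
        con.close()

    radii = []
    for (name,) in rows:
        suffix = name[len("radius"):]
        # Keep only names of the exact form 'radius<digits>'.
        if suffix.isdigit():
            radii.append(int(suffix))
    return sorted(radii)
```

For example, print(available_radii('test.db')) shows the radii that can be passed as radius= for that file. An empty list would suggest the DB name is wrong or that no radius table was written.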

suice07 commented 1 year ago

Oh, sorry. I tried with radius=2 and it works; maybe radius=3 is too much for such a small DB. Thanks so much for the help!

suice07 commented 1 year ago

[image]

Sorry to bother you again, but I have another problem while using mutate_mol. The original molecule looks like the one above; its SMILES is 'NC(=N)c4ccc3[nH]c(c2cc(Cl)cc(c1cccc(N)c1)c2O)nc3c4'. I set the protected ids to [9, 10, 11, 12, 13, 14, 22, 23],

[image]

so, in theory, the marked part should stay the same.

[image]

But I got a lot of results like this, in which

[image]

this part was never changed. I set the radius to 3 and used the database I built from zink250 (radius also set to 3). Did I miss some settings?

DrrDom commented 1 year ago