docking-org / cartblanche22

A molecule shopping cart and ZINC-22 search tool
https://cartblanche22.docking.org

Downloading local dump for ZINC22 #73

Open YojanaGadiya opened 1 year ago

YojanaGadiya commented 1 year ago

Dear team,

Is there a way to download the Postgres database of ZINC locally? I have been looking for this dump on the Wiki page but wasn't successful in retrieving the location where I can directly download the dump. Please can you provide some guidance on this?

Thank you.

jir322 commented 1 year ago

Hi Yojana

ZINC-22 consists of 170-180 Postgres databases. Just keeping it running where it is installed is a full-time job. What problem are you trying to solve, please?

John

John Irwin UCSF Pharmaceutical Chemistry http://irwinlab.compbio.ucsf.edu


YojanaGadiya commented 1 year ago

Dear John Irwin,

I have a set of compounds and want to know how many of them are commercially available, and from which vendors. They all have ChEMBL/PubChem IDs, and I am trying to find a way to connect the two resources.

Regards, Yojana Gadiya

jir322 commented 1 year ago

Hi Yojana

I would use the SMILES search feature in cartblanche22 via curl. It currently searches only ZINC-22 (the make-on-demand set of 37 billion compounds), which is probably not what you want. We can add ZINC20 to the SMILES search in cartblanche22, but it will probably take us a day. We recommend searching with curl and sending at most 100 molecules at a time. How many molecules do you want to search? I figure it might take some time.
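The 100-molecule limit mentioned above can be enforced on the client side before any request is sent. The following is a minimal sketch, not cartblanche22 code: `batches`, `search_in_batches`, and the `submit` callback are hypothetical names, and `submit` stands in for whatever actually sends one batch to the SMILES search (for example, a wrapper around the curl call), since the request format is not given in this thread. The pause between batches is likewise an assumption, meant only to keep the load on the server modest.

```python
import time
from typing import Any, Callable, Iterator, List, Sequence

MAX_BATCH = 100  # upper bound per request suggested in this thread


def batches(items: Sequence[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def search_in_batches(
    smiles_list: Sequence[str],
    submit: Callable[[List[str]], Any],  # hypothetical: sends one batch, returns its result
    pause_seconds: float = 1.0,          # assumed courtesy delay between requests
) -> List[Any]:
    """Call `submit` once per batch and collect the results in order."""
    results = []
    for index, batch in enumerate(batches(smiles_list)):
        if index > 0:
            time.sleep(pause_seconds)
        results.append(submit(batch))
    return results
```

With 20,000 molecules and batches of 100, this amounts to 200 requests.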

Here's another idea: download ZINC20 as SMILES and do your own matchup using RDKit and either InChIKeys or fingerprints (FPs). Then, once you have ZINC IDs, you can retrieve info from us programmatically.
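The InChIKey variant of this matchup could look roughly like the sketch below. It is only an illustration under stated assumptions: the ZINC20 download is assumed to be readable as `(zinc_id, smiles)` pairs (the actual file layout is not described in this thread), and `rdkit_inchikey` and `match_by_key` are hypothetical helper names. `Chem.MolFromSmiles` and `Chem.MolToInchiKey` are RDKit functions, and the latter needs an RDKit build with InChI support. The small query set is indexed in memory and the large catalog is streamed past it, so the catalog never has to be held in memory.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Record = Tuple[str, str]  # (identifier, SMILES)


def rdkit_inchikey(smiles: str) -> Optional[str]:
    """Return the InChIKey for a SMILES string, or None if it cannot be parsed."""
    from rdkit import Chem  # imported lazily so the join logic runs without RDKit

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return Chem.MolToInchiKey(mol) or None


def match_by_key(
    query: Iterable[Record],
    catalog: Iterable[Record],
    key_fn: Callable[[str], Optional[str]] = rdkit_inchikey,
) -> Dict[str, List[str]]:
    """Map each query ID to the catalog IDs that share its structure key."""
    wanted: Dict[str, List[str]] = defaultdict(list)
    matches: Dict[str, List[str]] = {}
    # Index the (small) query set by key.
    for query_id, smiles in query:
        matches[query_id] = []
        key = key_fn(smiles)
        if key:
            wanted[key].append(query_id)
    # Stream the (large) catalog once and record every hit.
    for zinc_id, smiles in catalog:
        key = key_fn(smiles)
        for query_id in wanted.get(key, ()):
            matches[query_id].append(zinc_id)
    return matches
```

Query IDs that end up with an empty list had no exact structure match in the catalog. A fingerprint-based comparison would be needed to find near matches instead.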

There is a SMILES lookup in ZINC20 already, but it will be too slow for you unless you are only looking up a few hundred molecules.

Please let us know. We are keen to support this option, taxing as it can be on our systems. If you figure out a good way, please let us know. Also let us know if we can make small developments to make it easier or more robust.



YojanaGadiya commented 1 year ago

Dear John,

Thank you for your comment. We are looking at 20,000 compounds at the moment. Indeed I can start with ZINC20. Are there any statistics on how much difference there is between ZINC20 and ZINC22?

Also, do you have future plans to create Python packages for licenced users? That could be a way for programmers to query the resource quickly.

Regards, Yojana Gadiya

jir322 commented 1 year ago

Just download the 20,000 you need and work with them (?)

Statistics: ZINC20 has about 1B purchasable compounds and 490M lead-like compounds ready to dock in 3D. ZINC22 has about 37B purchasable compounds and about 4.5B lead-like compounds ready to dock in 3D.

We are not averse to making Python packages; indeed, we have done so in the past. The problem is not the Python package, but the potential load its use puts on the backend database that supports it. Perhaps you could enumerate a few problems you are trying to solve, and we could see whether there is a way we can help you using the available (and supportable) resources. Thank you.

