mSorok / NaturalProductsOnline

Website code for COCONUT
https://coconut.naturalproducts.net/

The file coconut_db.sdf can't be split into small files; does it only contain 2 compounds? #78

Closed alongdedidi closed 2 years ago

alongdedidi commented 2 years ago
steinbeck commented 2 years ago

A bit more information is always helpful when filing an issue. :) Where did you get this screenshot from, or better, from where did you download the coconut_db.sdf? I actually downloaded the coconut_db.sdf today from https://coconut.naturalproducts.net/download and it contains 406747 SDF entries (molecules).

One more check on this: I ran obabel to parse my freshly downloaded coconut_db.sdf and convert all entries into inchikeys for fun (I am omitting some 400k lines for brevity :)):

% obabel -i sdf COCONUT_DB.sdf -o inchikey
YPIPZZKRLNAIQI-UHFFFAOYSA-N
FJEMIESGEMWDOB-UHFFFAOYSA-N
KLWKJVYCDFWQMK-UHFFFAOYSA-N
PTEKHLCNKCAXPH-UHFFFAOYSA-N
ZVAVQCZAGOKAMX-UHFFFAOYSA-N
UYIPOCQHTAYRMA-UHFFFAOYSA-N
YEXGCFBQFMYIMF-UHFFFAOYSA-N
RXBUYPIJHADATL-UHFFFAOYSA-N
SKPUHXLYHIKKLH-UHFFFAOYSA-N
ZGPVLXMBBBNMMX-UHFFFAOYSA-N
JMIUTOPIWSGIAG-UHFFFAOYSA-N
SAMXOERPOHKPHJ-UHFFFAOYSA-N
CNGZYBVXYGCWJB-UHFFFAOYSA-N
ZZCKHTCZYKQTPW-UHFFFAOYSA-N
ATSPGPYEGAPMOB-UHFFFAOYSA-N
CTECTDAKIKVAIQ-UHFFFAOYSA-N
WEBRWXVGMXDMGZ-UHFFFAOYSA-N
NODVMLZNRKEQEG-UHFFFAOYSA-N
NOFLAMAMSCCPHR-UHFFFAOYSA-N
TWDDLXSXMWFAMM-UHFFFAOYSA-N
ADYGBTOIDDRJBU-UHFFFAOYSA-N
ZKIUQLSTRRFWTL-UHFFFAOYSA-N
XPIBBCMJKDDJDY-UHFFFAOYSA-N
BPMBJHGRMIYIMZ-UHFFFAOYSA-N
YGLORTPCIYSYJE-UHFFFAOYSA-N
RCOQLIMTZZDCIP-UHFFFAOYSA-N
NQKMVCQBCHLRMH-UHFFFAOYSA-N
NPZJYDRURCXSHV-UHFFFAOYSA-N
HCBJVFDUUCPXGN-UHFFFAOYSA-N
CTCVCHNZEOOKDE-UHFFFAOYSA-N
VJTAVDGKPYOHRD-UHFFFAOYSA-N
HUZXFTOJSJNGAH-UHFFFAOYSA-N
ATHUMIFMCBNWTQ-UHFFFAOYSA-N
CGGQJNOEEHACAZ-UHFFFAOYSA-N
HYFWJORQMIVRSD-UHFFFAOYSA-N
JTWOBQYRTXVIEC-UHFFFAOYSA-N
HJRDVDMZBAHNCY-UHFFFAOYSA-N
KMANCXYQEHIDCC-UHFFFAOYSA-N
CGYNNONCMUXGEH-UHFFFAOYSA-N
MXMBNQJGJFSEPP-UHFFFAOYSA-N
LZYNDMZLGOSHSZ-UHFFFAOYSA-N
YVTMAXBPKFJHLW-UHFFFAOYSA-N
PNVRNENCSPHIML-UHFFFAOYSA-N
YVOKPXWXLWYSMI-UHFFFAOYSA-N
WPLHQARBFZMPTI-UHFFFAOYSA-N
BAEUPHPDJDLQKX-UHFFFAOYSA-N
YROXLKZGAZSYJV-UHFFFAOYSA-N
RBSIVFPWLZPNGB-UHFFFAOYSA-N
JEAPTBQDYOJQDQ-UHFFFAOYSA-N
SLZVZYBXARPMKC-UHFFFAOYSA-N
RZYUJCKPJOOZBL-UHFFFAOYSA-N
GKLHAFKNFLHTFO-UHFFFAOYSA-N
YMSRDLCFCMPXGQ-UHFFFAOYSA-N
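
The entry count mentioned above can also be checked without Open Babel. In an SD file every record ends with a line containing only `$$$$`, so counting those lines gives the number of entries. The following is a minimal sketch under that assumption; the function name `count_sdf_records` is made up for illustration and is not part of COCONUT or Open Babel.

```python
def count_sdf_records(path):
    """Count records in an SD file by counting '$$$$' delimiter lines.

    Hypothetical helper for illustration only. It assumes the file follows
    the usual SD file layout, where each record is terminated by a line
    that contains nothing but '$$$$'.
    """
    count = 0
    # Read line by line so that a multi-gigabyte file never has to fit in memory.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip() == "$$$$":
                count += 1
    return count
```

Calling `count_sdf_records("COCONUT_DB.sdf")` on a local download and comparing the result with the entry count reported above is a quick way to see whether the file is plausibly complete.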

alongdedidi commented 2 years ago

Thank you for your reply, but I really did download the coconut_db.sdf today from https://coconut.naturalproducts.net/download. Could you please tell me how to split the file, or give me a link about obabel?

steinbeck commented 2 years ago

Funny that you ask that :) I just wanted to write to you exactly this:

obabel -i sdf COCONUT_DB.sdf -o mol -O coco.mol -m

This will give you coco1.mol, coco2.mol, etc

But I also noted that my download is larger: 3048817341 Dec 2 11:19 COCONUT_DB.sdf. Maybe yours was not complete? I just filed an issue today, btw, that the SDFs should be zipped.
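
Whether a download was cut short can be roughly checked by looking at the file size and at the end of the file. This is only a sketch of a heuristic, not a guarantee: it assumes that a complete SD file ends with a `$$$$` delimiter line, and a download that happened to stop exactly at a record boundary would still pass. The name `check_sdf_tail` and the `tail_bytes` parameter are hypothetical.

```python
import os


def check_sdf_tail(path, tail_bytes=4096):
    """Return (size_in_bytes, ends_with_delimiter) for an SD file.

    Hypothetical helper for illustration only. It reads just the last
    `tail_bytes` bytes and checks whether the final non-empty line is the
    '$$$$' record delimiter.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        # Jump close to the end instead of reading the whole file.
        handle.seek(max(0, size - tail_bytes))
        tail = handle.read().decode("utf-8", errors="replace")
    non_empty = [line for line in tail.splitlines() if line.strip()]
    ends_with_delimiter = bool(non_empty) and non_empty[-1].strip() == "$$$$"
    return size, ends_with_delimiter
```

The returned size can be compared with the byte count quoted above; a `False` in the second position suggests the file stops in the middle of a record.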

alongdedidi commented 2 years ago

I'm sorry, I'm a student majoring in phytochemistry and I don't know much about cheminformatics. I want the individual files in SDF format. Professor, can you give me some advice?

steinbeck commented 2 years ago

SDF and mol files are essentially the same format. So if you want to split the big coconut_db.sdf into individual molecules in SDF format, run the command that I gave you with sdf as the output format (obabel -i sdf COCONUT_DB.sdf -o sdf -O coco.sdf -m). Open Babel can be downloaded from http://openbabel.org/. If you have further issues, send me an email at christoph.steinbeck@uni-jena.de, since this thread is no longer related to a problem with the COCONUT DB.
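
For anyone who cannot install Open Babel, the same kind of split can be sketched in plain Python. This is not what obabel does internally; it is a minimal sketch that assumes each record ends with a `$$$$` line and simply copies the bytes of every record into its own file. The function name `split_sdf`, the `coco` prefix (chosen to mirror the file names above) and the `limit` parameter are assumptions made for illustration.

```python
import os


def split_sdf(path, out_dir, prefix="coco", limit=None):
    """Write each record of an SD file to its own file; return the count.

    Hypothetical helper for illustration only. Records are copied byte for
    byte, so data fields after 'M  END' are kept. A trailing fragment with
    no closing '$$$$' line (e.g. from a truncated download) is not written.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = 0
    record = []
    with open(path, "rb") as handle:
        for line in handle:
            # Stop early if the caller only wants the first `limit` records.
            if limit is not None and written >= limit:
                break
            record.append(line)
            if line.strip() == b"$$$$":
                written += 1
                out_path = os.path.join(out_dir, f"{prefix}{written}.sdf")
                with open(out_path, "wb") as out:
                    out.writelines(record)
                record = []
    return written
```

A call such as `split_sdf("COCONUT_DB.sdf", "coco_split", limit=100)` would write `coco1.sdf` to `coco100.sdf` into the folder `coco_split`. Splitting the full file produces several hundred thousand small files in one directory, so trying it with a small `limit` first is advisable.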

alongdedidi commented 2 years ago

Thank you very much professor (^_^)

alongdedidi commented 2 years ago

btw, professor, 3048817341 bytes / 1024 ≈ 2977361 kB (^_^)