BlueObelisk / jumbo-converters

Converters for legacy to and from CML
Apache License 2.0

Missing dependencies #3

Open ostueker opened 4 years ago

ostueker commented 4 years ago

Initial post:

I've just had a glance over the dependencies and noticed a few where I'm not sure whether we have everything that's needed:

chemdraw-converter

Found it in @petermr 's Bitbucket account at https://bitbucket.org/petermr/chemdraw-converter and because it's a Mercurial repo, I've imported it to the BlueObelisk org.

jbabel

This maven artifact has the group-ID sea36 so hopefully @sea36 has an idea where it comes from.

Searching a bit I found a blog post from 2007 that links to a location where one can download a jar.

I'm not sure whether it's the same thing, though.

jumbo-units

Here I have no idea what it is or where to find it. 🤔

petermr commented 4 years ago

On Sun, Jan 5, 2020 at 12:02 AM Oliver Stueker notifications@github.com wrote:

I've just had a glance over the dependencies and noticed a few where I'm not sure whether we have everything that's needed: chemdraw-converter

Found it in @petermr https://github.com/petermr 's Bitbucket account at https://bitbucket.org/petermr/chemdraw-converter and because it's a Mercurial repo, I've imported it to the BlueObelisk org.

Thanks

jbabel

This maven artifact has the group-ID sea36 so hopefully @sea36 https://github.com/sea36 has an idea where it comes from.

Searching a bit I found a blog post from 2007 https://depth-first.com/articles/2007/12/10/run-babel-anywhere-java-runs-with-jbabel/ that links to a location https://sourceforge.net/projects/rxf/files/jbabel/jbabel-20071209/ where one can download a jar https://sourceforge.net/projects/rxf/files/jbabel/jbabel-20071209/jbabel-20071209.jar .

I'm not sure whether it's the same thing, though.

Suggest posting on blue-obelisk to see if anyone remembers. My guess is it should be ignored and we should look for other Babel functionality to replace it - e.g. running openbabel as a Java Process.
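A minimal sketch of the "openbabel as a Java Process" idea, assuming the Open Babel `obabel` command-line tool is installed and on the PATH. The class and method names (`ObabelProcessSketch`, `buildCommand`, `convert`) are hypothetical and are not part of jbabel or jumbo-converters; the format IDs `smi` and `cml` are only examples.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

final class ObabelProcessSketch {

    // Builds the argument list for one conversion.
    // Assumption: the executable is called "obabel" and is on the PATH.
    static List<String> buildCommand(Path input, String inFormat, Path output, String outFormat) {
        return List.of("obabel",
                "-i" + inFormat, input.toString(),
                "-o" + outFormat, "-O", output.toString());
    }

    // Starts obabel as an external process, forwards its console output,
    // and returns the exit code once the process has finished.
    static int convert(Path input, String inFormat, Path output, String outFormat)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, inFormat, output, outFormat))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Example: java ObabelProcessSketch in.smi out.cml
        int exit = convert(Path.of(args[0]), "smi", Path.of(args[1]), "cml");
        System.out.println("obabel exited with " + exit);
    }
}
```

Keeping the command construction separate from the process launch means the arguments can be unit-tested on machines where Open Babel is not installed.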

jumbo-units

Here I have no idea what it is or where to find it. 🤔

We did units because NIST was so slow in coming up with units in XML. We would be much better off now using Wikidata as the whole world will maintain that.

In general we should try to find communal ways of supporting this functionality - e.g. much of Euclid could be found elsewhere. In Python this would undoubtedly work but Java is more scattered. Science and (some) maths is patchy.

-- Peter Murray-Rust Founder ContentMine.org and Reader Emeritus in Molecular Informatics Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK

ostueker commented 4 years ago

Thanks Peter, and I totally agree that it is worthwhile to replace some old and outdated packages with more modern and more complete and robust ones.

However right now my main aim is to get to the point where everything compiles to make sure we don't miss any critical components and that without making too many (or any) changes in the existing code base. I'd like to have the old test suite passing before starting to replace one library by another.

ostueker commented 4 years ago

Ha! I just found the source code for jumbo-units and a whole list of other tools within the CML project on sourceforge.net.

There’s even the chemdraw-converter including its commit history. I can probably do a re-import from there.

petermr commented 4 years ago

Brilliant. Yes, we started on Sourceforge (SVN and before that CVS...). Sourceforge became unusable for several reasons. It's really good that you are pulling this together.

I quite agree the initial task is to get everything working.

FWIW my main activities in the time since Bitbucket are now (I think) coalesced into GH/petermr/ami3. Almost everything else on GH/petermr is either an early version (and now subsumed into ami3) or a standalone analysis project including data.

I am not aware of other repos of PMR code waiting to be discovered. The PMRGroup has

Any JUMBO earlier than 6 is for historians only.

When this is compiling and working I think the next thing is to find and highlight (showcase) working examples of what the codes do. In many cases this is hopefully just identifying some key Tests.

On Sun, Jan 5, 2020 at 2:51 PM Oliver Stueker notifications@github.com wrote:

Ha! I just found the source code for jumbo-units and a whole list of other tools within the CML project on sourceforge.net https://sourceforge.net/p/cml/code/HEAD/tree/.

There’s even the chemdraw-converter including its commit history. I can probably do a re-import from there.

ostueker commented 4 years ago

For the first time in probably more than 3 years, I was able to compile the jumbo-converters but not without deactivating some modules and making short dives into the source code.

I found a few more problematic dependencies:

owlapi

I can't find g:owlapi a:owlapi but the project seems to be on Maven Central these days: https://github.com/owlcs/owlapi/. Need to try out which sub-packages are needed and which version works with our code. For now I have deactivated jumbo-converters-rdf (depends on owlapi) and jumbo-converters-reaction (needs JC-rdf).

lensfield2

I found some lensfield code on Bitbucket but that seems to be an older version (v1 not v2).
However I was able to remove the dependency from the code (mostly annotations like @LensfieldParameter ). From what I saw, their purpose was to run the jumbo-converters within the lensfield framework. My guess: not a big loss.

osra-runner (group id uk.ac.cam.ch.osra-runner)

I have deactivated jumbo-converters-graphics (depends on osra-runner) and jumbo-converters-spectrum (needs JC-graphics). In an old osra-runner-parent POM that I found in some old maven-cache, I noticed that @sea36 seems to be the developer on this.
Sam, can you provide some more insight on what it does and whether the code-base still exists?

semsci-chemistry (group id gigadot.semsci)

This is a dependency for jumbo-converters-compchem-misc. I believe it's also RDF-related but I haven't tried to read the code to find out what it does.

Again: no public code to be found anywhere.

petermr commented 4 years ago

On Sun, Jan 5, 2020 at 8:45 PM Oliver Stueker notifications@github.com wrote:

For the first time in probably more than 3 years, I was able to compile the jumbo-converters

Well done! (I suspect some of the problems were that missing modules were probably locally present in Cambridge and not available globally even though we probably assumed they were).

but not without deactivating some modules and making short dives into the source code.

I found a few more problematic dependencies: owlapi

I can't find g:owlapi a:owlapi but the project seems to be on Maven Central these days: https://github.com/owlcs/owlapi/. Need to try out which sub-packages are needed and which version works with our code. For now I have deactivated jumbo-converters-rdf (depends on owlapi) and jumbo-converters-reaction (needs JC-rdf).

I suspect this relates to OWL-RDF - I don't think this is likely to be useful in the present. Archive without integration.

lensfield2

I found some lensfield code on Bitbucket https://bitbucket.org/lensfield/ but that seems to be an older version (v1 not v2). However I was able to remove the dependency from the code (mostly annotations like @LensfieldParameter ). From what I saw, their purpose was to run the jumbo-converters within the lensfield framework. My guess: not a big loss.

No. We have a MUCH better system now, picocli, and any CLI or framework should be (easily) converted to that.

osra-runner (group id uk.ac.cam.ch.osra-runner)

I have deactivated jumbo-converters-graphics (depends on osra-runner) and jumbo-converters-spectrum (needs JC-graphics). In an old osra-runner-parent POM that I found in some old maven-cache, I noticed that @sea36 https://github.com/sea36 seems to be the developer on this. Sam, can you provide some more insight on what it does and whether the code-base still exists?

I remember the name - can't remember details. I am sure we have better approaches.

semsci-chemistry (group id gigadot.semsci)

This is a dependency for jumbo-converters-compchem-misc. I believe it's also RDF-related but I haven't tried to read the code to find out what it does.

Again: no public code to be found anywhere.

It was created by/for Weerapong from Chemical Engineering (Marcus Kraft). I suspect it is linked to his repo. Not needed.

Sounds great.

At some stage we'll need to look at how the tests perform and almost certainly triage some.

sea36 commented 4 years ago

Hi Everyone,

It is nice to see that some of our work is still being used!

I've not looked at any of this code for a few years now, but I can give you pointers to a few of the repos you're looking for:

jbabel

This was a tool for calling OpenBabel from Java - either via JNI, or invoking the command line. https://sourceforge.net/p/jnati/code/HEAD/tree/jbabel/trunk/src/main/java/sea36/jbabel/

osra-runner

This was a tool for running OSRA from Java. https://bitbucket.org/seadams/osra-runner https://bitbucket.org/seadams/osra-runner/src/

lensfield2

This was an experiment to create a repeatable 'make' like pipeline for data processing. It never got much use - was mostly built for the Green Chain Reaction project. As Peter says, I'm sure there are much better tools around now. https://bitbucket.org/seadams/lensfield2/

Hope that helps,

Sam

petermr commented 4 years ago

Thanks Sam - can you remind us what OSRA did?

sea36 commented 4 years ago

OSRA is an NIH/NCI tool for converting images of chemical structures to SMILES:

https://cactus.nci.nih.gov/osra/

petermr commented 4 years ago

Thanks - of course. So this is a JNI-type application. I am making progress on my own bitmap code which might at some stage be an alternative. OSRA became almost closed - you either had to compile C (not trivial) or buy an EXE.

ostueker commented 4 years ago

Hi Sam, thank you for the hints.

Interestingly I didn’t see any errors regarding the missing jbabel. So either it’s no longer needed or it’s needed by one of the modules that I had to deactivate for another reason.

The lensfield2 and osra-runner packages are Mercurial repos on Bitbucket and therefore subject to deletion in a few months.

Sam, Peter, should we import them to this BlueObelisk org on GitHub and treat them along with the other WWMM repos? None of them has a License.txt file in the root of the repo, nor do they define any license in the main POM. Should those be under Apache 2.0 as well?

Oliver

petermr commented 4 years ago

On Mon, Jan 6, 2020 at 12:50 AM Oliver Stueker notifications@github.com wrote:

Hi Sam, thank you for the hints.

Interestingly I didn’t see any errors regarding the missing jbabel. So either it’s no longer needed or it’s needed by one of the modules that I had to deactivate for another reason.

Thanks. No action needed. If it's mission-critical it can be re-engineered

The lensfield2 and osra-runner packages are Mercurial repos on Bitbucket and therefore subject to deletion in a few months.

If they can be simply copied and archived that would be fine. Perhaps we can add a flag to each repo giving its status: e.g. INTEGRATED, TESTED, ARCHIVED.
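One way the status flag could be represented, as a rough sketch only: the three values come from the suggestion above, while the class name `RepoStatusSketch`, the `statusLine` helper, and the output format are hypothetical. Marking lensfield2 and osra-runner as ARCHIVED here is purely illustrative.

```java
final class RepoStatusSketch {

    // The three states suggested in the comment above.
    enum Status { INTEGRATED, TESTED, ARCHIVED }

    // Formats a one-line marker that could go at the top of a repo's README.
    static String statusLine(String repo, Status status) {
        return repo + ": " + status;
    }

    public static void main(String[] args) {
        // Illustrative assignments only; the actual status of each repo is still to be decided.
        System.out.println(statusLine("lensfield2", Status.ARCHIVED));
        System.out.println(statusLine("osra-runner", Status.ARCHIVED));
    }
}
```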

Sam, Peter, should we import them to this BlueObelisk org on GitHub and treat them along with the other WWMM repos?

Yes,

None of them has a License.txt file in the root of the repo, nor do they define any license in the main POM. Should those be under Apache 2.0 as well?

I favour this and I don't think there is anyone else who will be upset.

Again, many thanks
