pemsley / coot

Software for macromolecular model-building
http://www2.mrc-lmb.cam.ac.uk/personal/pemsley/coot/
GNU General Public License v3.0

Official Debian package #120

Open alexmyczko opened 4 months ago

alexmyczko commented 4 months ago

There's the intent to package at https://bugs.debian.org/897673, and the current public state is at https://salsa.debian.org/science-team/coot. There is also some success on a local machine that builds 1.1.07, and if software builds successfully, it is likely to also work and be packageable. (Using this place to keep track of the history of building the package.)

correction: almost success

[100%] Linking CXX executable test-molecules-container
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libssm.so: undefined reference to `mmdb::Atom::Transform(double (&) [4][4])'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libssm.so: undefined reference to `mmdb::Mat4Copy(double (&) [4][4], double (&) [4][4])'
collect2: error: ld returned 1 exit status

Just to see where we are: https://repology.org/project/coot/versions (comparisons are bad; see Paul Watzlawick, The Situation Is Hopeless, But Not Serious: The Pursuit of Unhappiness)

pemsley commented 3 months ago

https://salsa.debian.org/science-team/coot/-/blob/master/debian/patches/compare-dictionaries.patch and https://salsa.debian.org/science-team/coot/-/blob/master/debian/patches/coot-make-shelx-restraints.patch are not needed, I think.

Also lidia and dynarama are no longer built.

merkys commented 3 months ago

https://salsa.debian.org/science-team/coot/-/blob/master/debian/patches/compare-dictionaries.patch and https://salsa.debian.org/science-team/coot/-/blob/master/debian/patches/coot-make-shelx-restraints.patch are not needed, I think.

Thanks for reviewing. Most of the patches are no longer applied; the only ones still applied are those not commented out in https://salsa.debian.org/science-team/coot/-/blob/master/debian/patches/series.

pemsley commented 3 months ago

OK. I have removed CPPFLAGS from swig usage.

merkys commented 3 months ago

OK. I have removed CPPFLAGS from swig usage.

Right, I was about to write concerning that. Apparently swig does not understand everything that goes into CPPFLAGS. Maybe GUILE_CPPFLAGS or SWIG_CPPFLAGS would be more appropriate here? I think I have seen these variables used in other Makefiles of coot.
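
As an illustration of the problem, here is a minimal sketch of filtering a flag list down to the options swig is known to accept from the preprocessor set (`-I` and `-D`). The function name `filter_swig_flags` is hypothetical and is not part of coot's build system; the sample flags are only an example of what a distribution might put into CPPFLAGS.

```shell
# Hypothetical helper: keep only -I<dir> and -D<symbol> options, which swig
# accepts, and drop everything else (for example compiler warning flags).
filter_swig_flags() {
    out=""
    for flag in "$@"; do
        case "$flag" in
            -I*|-D*) out="${out:+$out }$flag" ;;
        esac
    done
    printf '%s\n' "$out"
}

# Example flag list (an assumption, not taken from the coot build):
filter_swig_flags -Wdate-time -D_FORTIFY_SOURCE=2 -I/usr/include/example
```

Using a dedicated variable such as the SWIG_CPPFLAGS suggested above avoids the need for any filtering; the sketch only shows which kinds of flags are safe to forward.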

merkys commented 3 months ago

I have spent some time investigating how to compile the debian/copyright file and I came to the conclusion that the easiest approach would be the following: add an all-encompassing entry at the top of debian/copyright:

Files: *
Copyright: 2001-2007, The University of York
 2007-2009, The University of Oxford
 2010-2016, Medical Research Council
 2001-2024, Paul Emsley
License: GPL-3+

and then list all the files which have different licenses and/or copyright holders below. The number of such files seems doable. @pemsley is this OK with you? By the way, one more directory, protein_db/, contains files with "all rights reserved" notices.

The current debian/copyright is far from ideal, but I am slowly getting to what I would like to have.

pemsley commented 3 months ago

The cross-over between Oxford and Medical Research Council was 2012 (not 2010).

Why do you prefer this new proposal over what already exists?

I don't understand what "all rights reserved" means and what makes it an issue. Is it because there is no license? If/when I get the go-ahead, I will change the notice for files in cootilus, cootaneer and protein_db.

paul.emsley@bioch.ox.ac.uk is a dead email address

Source: http://coot.googlecode.com/svn/trunk/ is dead.

merkys commented 3 months ago

The cross-over between Oxford and Medical Research Council was 2012 (not 2010).

Thanks, fixed.

Why do you prefer this new proposal over what already exists?

The current debian/copyright was auto-generated a bunch of years ago. It badly needs updating and decrufting in order to be maintainable. A concise debian/copyright would be easier to update with subsequent coot releases, as newly added files will fall under the top-level Files: * rule and only the ones with different copyright owners/licenses will have to be added separately. Please let me know should you have any objections to this.

I don't understand what "all rights reserved" means and what makes it an issue. Is it because there is no license? If/when I get the go-ahead, I will change the notice for files in cootilus, cootaneer and protein_db.

Some of my packages were rejected by Debian due to unclear licensing of individual files, but now I cannot trace back any case where it happened due to an "all rights reserved" notice. So maybe there is nothing wrong with it, just me being overly cautious. Let us keep the current license statements for now and see how Debian's copyright review goes. I am not a lawyer, though.

paul.emsley@bioch.ox.ac.uk is a dead email address

Source: http://coot.googlecode.com/svn/trunk/ is dead.

I have replaced these with links to GitHub. This seems to be the usual practice in Debian for GitHub-based projects, but let me know if you prefer other addresses.

pemsley commented 3 months ago

There will be action on the files in the various directories mentioned above in the next few days.

alexmyczko commented 3 months ago

once it works on debian, I will try a homebrew recipe for macOS :)

pemsley commented 3 months ago

once it works on debian, I will try a homebrew recipe for macOS :)

This has been under discussion for some time: https://github.com/pemsley/coot/issues/33

I also have a GitHub Action that tries to build using it.

merkys commented 3 months ago

I managed to successfully build the package on Debian. The remaining tasks for Debian package:

Now about shared libraries. Coot is dynamically linked against a bunch of its own shared libraries (libcoot* and some other names). In Debian these will have to be packaged either as public or private shared libraries:

pemsley commented 3 months ago

Running the tests in python-tests requires additional data files. One can run less extensive internal tests that do not require additional data. Or that used to be the case - it seems that I need to reactivate that option.

What needs to be done for the CMake part?

Fonts are something that you will patch, I take it.

Let's make the libraries private (except the CMake target libcootapi, I suppose). Coot's launcher (bin/coot) in your case can be replaced by a one-liner exec $prefix/libexec/coot-bin $* or some such. The other stuff is only needed for the binary tar ball if the prefix directory is relocated.
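
A minimal sketch of such a launcher, tried out in a throwaway prefix. The `coot-bin` created here is a stand-in script that only prints its arguments, not the real binary, and the temporary directory layout is an assumption for the demonstration. The sketch uses `"$@"` rather than `$*`, because an unquoted `$*` re-splits arguments that contain spaces (such as file names).

```shell
# Throwaway prefix for the demonstration (stand-in for the real install prefix).
prefix=$(mktemp -d)
mkdir -p "$prefix/bin" "$prefix/libexec"

# Stand-in for the real binary: prints each argument on its own line.
cat > "$prefix/libexec/coot-bin" <<'EOF'
#!/bin/sh
for arg in "$@"; do printf '[%s]\n' "$arg"; done
EOF

# The launcher itself: a single exec that forwards all arguments unchanged.
cat > "$prefix/bin/coot" <<EOF
#!/bin/sh
exec "$prefix/libexec/coot-bin" "\$@"
EOF
chmod +x "$prefix/bin/coot" "$prefix/libexec/coot-bin"

# An argument containing a space arrives as one argument.
launcher_output=$("$prefix/bin/coot" "first arg" second)
printf '%s\n' "$launcher_output"
rm -rf "$prefix"
```

In a Debian package the prefix would be fixed at build time, so the launcher reduces to the two lines written into `bin/coot` above.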

merkys commented 3 months ago

Running the tests in python-tests requires additional data files. One can run less extensive internal tests that do not require additional data. Or that used to be the case - it seems that I need to reactivate that option.

Right, additional data files might be an issue. So less extensive internal tests would be very nice to have.

What needs to be done for the CMake part?

Not much - I just have to add it to the Debian package building rules. I succeeded in running the CMake build manually, thus it should be trivial to add it to the rules.

Fonts are something that you will patch, I take it.

Yes.

Let's make the libraries private (except the CMake target libcootapi, I suppose). Coot's launcher (bin/coot) in your case can be replaced by a one-liner exec $prefix/libexec/coot-bin $* or some such. The other stuff is only needed for the binary tar ball if the prefix directory is relocated.

OK, thanks, this is what I will do.

merkys commented 3 months ago

Let's make the libraries private (except the CMake target libcootapi, I suppose). Coot's launcher (bin/coot) in your case can be replaced by a one-liner exec $prefix/libexec/coot-bin $* or some such. The other stuff is only needed for the binary tar ball if the prefix directory is relocated.

Am I right that libcootapi should be a public library? If so, it does not seem to be installed under CMAKE_INSTALL_PREFIX by CMake (CMakeLists.txt probably misses install()) and it does not have a soversion (version appended to soname), viz:

coot$ objdump -x debian/tmp/usr/lib/x86_64-linux-gnu/libcoot-cabuild.so | grep SONAME
  SONAME               libcoot-cabuild.so.0
coot$ objdump -x obj-x86_64-linux-gnu/libcootapi.so | grep SONAME
  SONAME               libcootapi.so
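
The check above can be scripted. The following is a minimal sketch that parses `objdump -x` output from standard input and reports whether the soname carries a version suffix; the function names are hypothetical, and the two sample lines are the ones shown above.

```shell
# Print the soname from objdump -x (or -p) output read on stdin.
extract_soname() {
    awk '$1 == "SONAME" { print $2 }'
}

# Succeed if the soname has a numeric suffix after ".so.".
soname_is_versioned() {
    case "$1" in
        *.so.[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

for line in '  SONAME               libcoot-cabuild.so.0' \
            '  SONAME               libcootapi.so'; do
    soname=$(printf '%s\n' "$line" | extract_soname)
    if soname_is_versioned "$soname"; then
        echo "$soname: versioned"
    else
        echo "$soname: no soversion"
    fi
done
```

Against a real library the input would come from something like `objdump -x path/to/lib.so | extract_soname`.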

pemsley commented 3 months ago

Am I right that libcootapi should be a public library

That question might mean more to you than to me - but it does seem possible that there will be C++ application developers who want to use the C++ api of Coot. Most developers will use the Python module - like RDKit, I suppose.

install(TARGETS cootapi DESTINATION lib)

Doesn't that do the trick? I don't understand much about CMake.

It seems to me that the soname should be "1.1" - I may change the API in future - I have done so in the recent past.

merkys commented 3 months ago

Am I right that libcootapi should be a public library

That question might mean more to you than to me - but it does seem possible that there will be C++ application developers who want to use the C++ api of Coot. Most developers will use the Python module - like RDKit, I suppose.

OK, libcootapi should be a public library then.

install(TARGETS cootapi DESTINATION lib)

Doesn't that do the trick? I don't understand much about CMake.

Maybe the following is more appropriate:

install(TARGETS cootapi DESTINATION ${CMAKE_INSTALL_PREFIX}/lib)

I do not know much about CMake either, but other install commands start with ${CMAKE_INSTALL_PREFIX} prefix. I am not sure where paths without it lead.

It seems to me that the soname should be "1.1" - I may change the API in future - I have done so in the recent past.

Sure, this is usually the case. Then the following instruction needs to be added for CMake:

set_target_properties(cootapi PROPERTIES SOVERSION 1.1)

merkys commented 3 months ago

Maybe the following is more appropriate:

install(TARGETS cootapi DESTINATION ${CMAKE_INSTALL_PREFIX}/lib)

This seems to do the same as already existing code:

install(TARGETS cootapi DESTINATION lib)

pemsley commented 3 months ago

--self-test to run a few self tests (no external data needed). Returns with 0 exit status on success.

I have tweaked CMakeLists.txt in the light of the above discussion.

merkys commented 3 months ago

--self-test to run a few self tests (no external data needed). Returns with 0 exit status on success.

Thanks. I tried running this test on the source at git commit 28325c44bd659fefae5b1abb5357c432769e534f (latest) and got the following output:

INFO:: Running internal self tests
 INFO:: Test Clipper core   : OK
 INFO:: Test Clipper contrib: OK
run_internal_tests() --------- we have 8 internal test functionns 
Entering test: kevin's torsion test
PASS: kevin's torsion test
Entering test: test_alt_conf_rotamers
INFO:: Reading coordinate file: /home/andrius/data/greg-data/tutorial-modern.pdb
INFO:: file /home/andrius/data/greg-data/tutorial-modern.pdb has been read.
FAIL: test_alt_conf_rotamers  found only 0 rotamers 

Tests seem to reach for /home/andrius/data/greg-data/tutorial-modern.pdb which is outside the source tree (moreover, I cannot find greg-data in the source). Thus I symlinked coot's data/ directory to ~/data/greg-data and now the required test files are found. Interestingly, despite FAIL on the last line, the exit status is 0. Maybe this is OK then?

I have tweaked CMakeLists.txt in the light of the above discussion.

Thanks, I confirm that the soversion is visible now.

pemsley commented 3 months ago

Oh, my bad. I should have checked the directory. --self-test (i.e. pure Coot install) shouldn't know about greg-data.

I don't like FAIL and zero exit status. I will investigate that too.
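
Until the exit status can be relied on, a packaging rule could additionally scan the test log. This is a minimal sketch, assuming the log uses the `PASS:` / `FAIL:` line prefixes shown in the output above; the function name is hypothetical.

```shell
# Succeed only if no line of the log on stdin starts with "FAIL:".
self_test_log_is_clean() {
    ! grep -q '^FAIL:'
}

# Two lines in the format of the self-test output quoted above.
sample_log="PASS: kevin's torsion test
FAIL: test_alt_conf_rotamers  found only 0 rotamers"

if printf '%s\n' "$sample_log" | self_test_log_is_clean; then
    verdict=clean
else
    verdict=failed
fi
echo "$verdict"
```

In a build rule the log would come from the captured output of the `--self-test` run, and a `failed` verdict would abort the build regardless of the exit status.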

merkys commented 3 months ago

By the way, there is still one occurrence of swig call with CPPFLAGS which fails if CPPFLAGS contains anything unknown to swig: https://github.com/pemsley/coot/blob/de224b2f21319c9024a679be7b0887d47c004c8d/src/Makefile.am#L181

pemsley commented 3 months ago

OK, both the --self-test problems have been cleaned up: https://github.com/pemsley/coot/commit/88d8739a80ce95bd7d9b449358a89858c68ee2b0

So I think that you can tick the "tests" box.

pemsley commented 3 months ago

By the way, there is still one occurrence of swig call with CPPFLAGS which fails if CPPFLAGS contains anything unknown to swig:

Ah yes. Python too. Fixed in 12401d584737b517adab5e40618f8bfe4f58350b

pemsley commented 3 months ago

Do you need anything more from me re debian/copyright?

merkys commented 3 months ago

Thanks, tests are now working (and passing successfully). I confirm that the swig issue is fixed now as well.

As for debian/copyright, I think copyright details are clear for all files now. For files with "all rights reserved" without explicit license I am just going to assume they fall under the GPL-3+ as the rest of the project.

Nevertheless, collecting all the copyright holders for all the files will take time. I have already greatly simplified the debian/copyright by stating that these are the copyright holders for all the files:

Files: *
Copyright: 2001-2007, The University of York
 2007-2012, The University of Oxford
 2012-2016, Medical Research Council
 1999-2024, Paul Emsley
 2004-2011, Bernhard Lohkamp
License: GPL-3+

It would further simplify debian/copyright a lot if I could put Kevin Cowtan and Kevin Keating here as well instead of cherry-picking their files separately. Please let me know if you agree or disagree with such treatment. Alternatively I would probably need to devise a script to look at all GPL-3+-licensed files and collate their holders and years. Current tools like decopy and licensecheck for some reason fail to extract copyright info from many files.

pemsley commented 3 months ago

It seems that I unintentionally missed some. I will fix buccaneer_ml_growing and a few other files from Kevin tonight.

Can you have a look at the ligand/dSFMT* files and see if they are OK?

pemsley commented 3 months ago

Kevin Cowtan and Kevin Keating should have cherry-picked files. I can make you a list for both.

merkys commented 3 months ago

It seems that I unintentionally missed some. I will fix buccaneer_ml_growing and a few other files from Kevin tonight.

Thanks.

Can you have a look at the ligand/dSFMT* files and see if they are OK?

Yes, they are licensed under BSD-3-Clause which is perfectly fine.

Kevin Cowtan and Kevin Keating should have cherry-picked files. I can make you a list for both.

Thanks. It would be nice to have an automated procedure for such cherry-picking in order to be able to update with every new coot release. Perhaps it would suffice to look for their surnames with something like grep -P 'Copyright.*Surname' and pass the result to an awk/perl script to collate years? I think I could write something usable in a couple of minutes.

By the way, what about Bernhard Lohkamp? I have taken the liberty to add him under Files: * as well, but maybe his files should be cherry-picked as well?

merkys commented 3 months ago

For example, the following extracts Cowtan's copyright lines from all files that are explicitly licensed under GPL version 3 or later:

find . -type f -print0 | xargs -0 grep -l 'GNU General Public License' | xargs grep -l 'either version 3 of the License' | xargs grep Cowtan | grep -i copyright

This extracts Cowtan's copyright lines from files without an explicit GPL notice:

find . -name debian -prune -o -type f -print0 | xargs -0 grep -l Cowtan | xargs grep -L 'GNU General Public License' | xargs grep Cowtan | grep -i copyright

After that, extracting year range and outputting in debian/copyright format should be easy.
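
A minimal sketch of the year-collating step, assuming copyright lines arrive on standard input without file name prefixes (for example from `grep -h`). The function name is hypothetical and the sample lines are made up for illustration. Any four-digit number starting with 19 or 20 is treated as a year, so the output still needs a quick manual check.

```shell
# Read copyright lines on stdin and print the covered year range,
# e.g. "2002-2005", or a single year if only one is found.
collate_years() {
    grep -oE '(19|20)[0-9]{2}' |
        sort -n |
        awk 'NR == 1 { first = $1 }
             { last = $1 }
             END { if (NR) print (first == last ? first : first "-" last) }'
}

# Made-up sample lines in a typical source-header style:
printf '%s\n' \
    ' * Copyright 2004, 2005 by Kevin Cowtan' \
    ' * Copyright 2002 by Kevin Cowtan' | collate_years
```

The resulting range can be pasted after `Copyright:` in the matching debian/copyright paragraph.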

pemsley commented 3 months ago

I think the current debian/copyright is more accurate and I prefer it.

Now that I understand a bit more about what it is, I would rather modify that than prune it back as you suggest. I will spend some time in the next few days on fixing the notices and debian/copyright - maybe tomorrow, but maybe not.

By the way, what about Bernhard Lohkamp? maybe his files should be cherry-picked as well?

Yes, I think so.

merkys commented 3 months ago

Thank you for your support in this. Writing a debian/copyright is quite often the most tedious part of creating a Debian package. I strive to make them both accurate and maintainable.

Accuracy matters a lot, as a package may not get into Debian because of missing license statements. In Debian, copyright review is done by a dedicated team. Sometimes a package has to wait months for its review. Then if its debian/copyright is inaccurate, it is "returned for repairs" and the process is repeated (including the waiting). I strive to get the debian/copyright right for its first review, but my success rate is around 2:3.

By maintainability I mean keeping the debian/copyright accurate for subsequent package updates. Subsequent package updates in Debian do not undergo copyright reviews (unless new binary packages are introduced, renamed, or soversions get bumped), but having an outdated debian/copyright kind of defeats its purpose (and is a serious bug if noticed). I maintain a couple of Debian packages which I cannot keep up-to-date due to the complexity of their debian/copyright files.

Thus it would be nice if we could devise an algorithm to collate the files and their copyright holders for coot. As said, current tools produce both false positives and false negatives. It is nice that coot's source files are properly annotated. Too bad that current automated tools miss some of them - perhaps their regular expressions fail.

merkys commented 3 months ago

I think I am going to give licensecheck another shot. I will exclude all non-text files (images and sounds), fixup scheme files with sed (licensecheck does not understand scheme's ;;;; as comment line) and run licensecheck. The output will still need some manual work (like removing all UNKNOWN files which fall under top-level copyright and putting BSD-3-Clause for ligand/dSFMT* files), but I expect that that will require less work (and be more accurate) than my current approach of minimising the length of debian/copyright. Then I will add raw licensecheck output in order to diff it against licensecheck output for the subsequent coot releases. What do you think?
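
A minimal sketch of the sed fix-up step for Scheme files. It rewrites a leading run of semicolons into a single `#`; that licensecheck then recognises such lines as comments is an assumption here, and the function name and the sample header line are made up. It should be run on a scratch copy of the tree, never on the packaged sources.

```shell
# Replace a run of leading semicolons with "#" (assumed to be a comment
# marker the license scanner understands); other lines pass through unchanged.
scheme_comments_to_hash() {
    sed 's/^;;*/#/'
}

# Made-up sample: a Scheme-style header line followed by code.
printf '%s\n' ';;;; Copyright 2001 by Example Author' '(define x 1)' |
    scheme_comments_to_hash
```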

pemsley commented 3 months ago

What are UNKNOWN files?

Which files do not need an embedded license?

autogen.sh
burn-up/make-new-graph.sh
burn-up/new-burn-point.sh
burn-up/process-rel-todo.awk
blog/_config.yml
.clang-format
ChangeLog
aux-scripts/mtrix-to-ncs-matrix.awk

merkys commented 3 months ago

What are UNKNOWN files?

By UNKNOWN files I mean the ones from which licensecheck cannot extract any licensing info (no copyright holder or license indicators). Normally files without any such indications are treated as falling under the top-level license appearing in the COPYING (or similarly named) file.

Which files do not need an embedded license?

I think none of them needs an embedded license iff COPYING is applicable to them.
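
To see which files a scan reports as UNKNOWN, the report can be filtered. This is a minimal sketch that assumes one `path: LICENSE` line per file in the report; the function name, the paths and the license labels in the sample are hypothetical. Paths that themselves contain `: ` would be truncated.

```shell
# Print the paths whose reported license is exactly "UNKNOWN".
list_unknown() {
    awk -F': ' '$NF == "UNKNOWN" { print $1 }'
}

# Hypothetical report lines in the assumed "path: LICENSE" format:
printf '%s\n' \
    './example/helper.sh: UNKNOWN' \
    './example/core.cc: GPL-3+' | list_unknown
```

The listed files are the ones that would fall under the top-level Files: * paragraph if COPYING applies to them.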

merkys commented 3 months ago

I think I am very close to getting debian/copyright right. As said, I used licensecheck to prepare a draft for debian/copyright and filled in the missing licenses without merging any of the paragraphs. You may find my attempt here (please mind that it is on a branch). The file is quite long now, but I think it has a great chance to pass the copyright review, as all licenses and copyright holders are explicitly listed for every file.

I still need some time to finalise this debian/copyright, but I expect the remaining work to take less than an hour. After finishing debian/copyright I am going to upload the package for the copyright review. As the review may take a few weeks, this will give us time to polish the package itself.

pemsley commented 2 months ago

I recently read the copyright notices of several files. I was surprised that they were GPLed. I had intended for them to be LGPL3+. I thought I had converted the licenses years ago. I will have to go back and convert MoleculesToTriangles, cootilus, cootaneer and protein_db also. Baah.

I will write to you here when those files have been changed.

The *-kk.cc files will remain GPL, they are not mine to change.

I have removed a few unused files, you may have noticed.

pemsley commented 2 months ago

I notice that other software package copyright notices have something like "This file is part of GnuTLS", "This file is part of the GNU MP Library." or "This file is part of GNU Nettle." after the copyright line. Is this line needed?

merkys commented 2 months ago

I notice that other software package copyright notices have something like "This file is part of GnuTLS", "This file is part of the GNU MP Library." or "This file is part of GNU Nettle." after the copyright line. Is this line needed?

This does not affect the treatment of copyright notices (i.e., it is not needed by Debian). However, developers of code sometimes choose to indicate the name of the project in each/some code files. I guess this is done mostly to retain a pointer to the original codebase in case their files get copied to other people's projects.

alexmyczko commented 2 months ago

@merkys is there anything I can help with?

merkys commented 2 months ago

@alexmyczko at the moment I am waiting for @pemsley to finish checking the copyright notices. After that the package should be ready for its initial upload, and while we wait for the copyright review we can finish polishing it.

I plan to work on these tasks myself as I have already started them:

Here are some tasks that you may help with:

pemsley commented 2 months ago

Yes. Sorry for the delay. The days are just packed. I've been working on Moorhen too and I just haven't had the hours to sit down and fix this. I will give it a bash in the next few days (if I can get work done on my laptop).

I bungled the previous change - I meant to use the LGPL.

alexmyczko commented 2 months ago

i can take 1, 2, and 4.

pemsley commented 2 months ago

Morten Kjeldgaard is no longer a friend of the project - see his twitter comment for an example. So, I agree.

pemsley commented 2 months ago

Re: non-amd64 bit: I have an as-yet-unboxed Raspberry Pi somewhere - I could try that. I expect the problem to be with the dependencies, e.g. RDKit, GTK4, PyGObject, clipper, gc - I don't see why the actual Coot part shouldn't just work.

alexmyczko commented 2 months ago

oh that is slow, i have http://bananas.debian.net

pemsley commented 2 months ago

OK, you can try the generic build script and see what happens...

$ wget https://raw.githubusercontent.com/pemsley/coot/main/build-it-3-3

$ bash build-it-3-3

merkys commented 2 months ago

Thanks @alexmyczko for taking the tasks! Please mind that I worked in package-latest-dev-version branch as I would like to reserve master for coot releases at the moment.

Re coots-nest. I will take care of this and will exclude it from the Debian package.

Re non-amd64. Debian supports a variety of architectures and machines with these architectures are made available for Debian developers to help with porting. But as said, this can wait until after coot gets accepted into Debian, as builds on all Debian architectures will be attempted automatically after that. Of course it does not hurt to try earlier.

pemsley commented 2 months ago

cootilus/nautilus_lib.pdb is a PDB file, not source code. How does the license work for that? (There are other such files, for example cootaneer/cootaneer-llk-2.40.dat and protein_db/protein.db.)

Also cootaneer/Makefile-coot - that also doesn't have a copyright notice. It is not used, I think. Should I remove it?

pemsley commented 2 months ago

I have added or made updates to many source code copyright notices.

My plan is to continue to do so, and to add or change the others to LGPL 3.

pemsley commented 2 months ago

This is slow-going, painstaking and dull.