RosettaCommons / tools

Tools for parsing Rosetta source code and scripts for running specific Rosetta applications.
Other
3 stars 1 forks source link

Add submodule updates to the release script. #75

Closed roccomoretti closed 5 years ago

roccomoretti commented 5 years ago

Now that we're using submodules more heavily, we really need to be releasing their contents.

This PR makes it such that the release process will release submodule contents in the standard release. The standard trimming approach is used, so any directory named 'pilot' under main/ will be removed (including those in submodules). We can add other trimming rules, if needed.

Current submodules which will be added to the public Rosetta release:

Definitely want to release:

Probably good to release:

Do we need to include these with the standard (non-PyRosetta) release (@lyskov) - we can trim these out specifically.

lyskov commented 5 years ago

Rocco a few thoughts:

Re converting to release: for current approach to work we will need to make sure that release script init submodules after initial cloning (somewhere around ~here https://github.com/RosettaCommons/main/blob/master/tests/benchmark/tests/release.py#L151 ). This is because right now we explicitly checking out any released submodules here: https://github.com/RosettaCommons/main/blob/master/tests/benchmark/tests/release.py#L108

Re 'Definitely want to release' list: are you sure about pyrosetta_scripts repository? Because my record shows that we have only green light for 'public' scripts from them?

lyskov commented 5 years ago

Re 'Do we need to include these with the standard (non-PyRosetta) release (@lyskov) - we can trim these out specifically.': - i think if we already including them with source release (at least i though so) which i think is good so our users could compile PyRosetta without extra downloads.

roccomoretti commented 5 years ago

This PR adds the submodule init to the convert_to_release.bash script, which is being called from within the convert_to_release() python function you link. One of the reasons to do this is that we want to do the submodule update prior to the pilot/ directory stripping, which happens within that bash script.

I was unaware of the specific cloning of binder and pybind11 in the release scripts. We can keep the submodules in the release, although there is now a redundancy in the update, with a difference in behavior. The version in the existing python script will checkout the repos' current version of master, rather than whatever submodule version was pinned for the version of Rosetta being released. I'd vote for matching the pinned version, myself. -- (Note that my initial thoughts were that (theoretically speaking) people are supposed to be licensing PyRosetta separately, and compiling from the standard Rosetta source distribution would allow them to bypass that. But apparently we're not concerned about that, which is fine too.)

Regarding pyrosetta_scipts, I'm not sure what you mean by "public" scripts. There doesn't seem to be any labelling of scripts in that directory as public/not-public. The general approach in Rosetta is to have things be released by default, unless they're explicitly marked not-for-public release. (Typically by putting it in a directory labelled 'pilot'.) As such, I would naively/porovocatively assume that the scripts there are ready for public release. -- Practically, though, I know that people probably haven't been thinking along those lines, so I was definitely going to prod the devel list about the issue prior to pulling the trigger on the update. I don't want to release any script that people don't want released.

But I would strongly argue that we should be releasing such scripts, and release should effectively be an opt-out process, rather than an opt-in process. If there's some other mechanism for labelling not-to-be-released files in the pyrosetta_scripts repo that works better there than the pilot/ directory approach, we can certainly update the convert_to_release.bash to use it. We can implement whatever stripping mechanism is desired, so long as there's a standard and it's followed.

(That said, I can see the argument that the pyrosetta_scripts directory is probably less useful for the command line release of Rosetta, so it might make sense to only release the directory for the PyRosetta distributions, rather than the source/command-line distributions.)

lyskov commented 5 years ago

@roccomoretti re scripts marked for public: i just looked this up and think i confuse this with rosetta-scripts-scripts repository (more on that below). I do agree with you that having ppl to opt-out from release is better strategy for us so lets assume that everything in pyrosetta-scripts repository is 'releasable' for now.

And this bring us to issue of what is releasable in rosetta-scripts-scripts repository. I see this directory (which i though i saw in PyRosetta scripts earlier) https://github.com/RosettaCommons/rosetta_scripts_scripts/tree/master/scripts and in it we see both pilot and public. I am assuming that we should only release things that in the public, thoughts?

I was unaware of the specific cloning of binder and pybind11 in the release scripts. We can keep the submodules in the release, although there is now a redundancy in the update, with a difference in behavior. The version in the existing python script will checkout the repos' current version of master, rather than whatever submodule version was pinned for the version of Rosetta being released. I'd vote for matching the pinned version, myself.

I agree that using 'pinned-version' is a better approach, so i guess lets comment lines 121 and 122 from here https://github.com/RosettaCommons/main/blob/master/tests/benchmark/tests/release.py#L121 for now?

re should we bind Binder and Pybind11 with Rosetta source release: - i think having just one package will be easier for us and for our academic users. Non academic users need to get separate licenses anyway and i doubt if splitting into multiple packages will stop rogue-non-academic-user from compiling PyRosetta. So my vote will be to keep number of packages at minimum and have only one Rosetta-source release package that contain all public data.

roccomoretti commented 5 years ago

Addresses RosettaCommons/main#3138