Put SourceMods in git, and/or store SourceMods as diffs

ESMCI / cime

Common Infrastructure for Modeling the Earth

http://esmci.github.io/cime

Other

159 stars 205 forks source link

Put SourceMods in git, and/or store SourceMods as diffs #3298

Open jedwards4b opened 4 years ago

jedwards4b commented 4 years ago

The issue #3295 illustrates the need to put SourceMod files under git.
The user had SourceMods developed in tag cime5.6.10_cesm2_1_rel_06 of cime. The user wanted a feature developed in a newer version of cime and was advised to update cime to the latest maint-5.6 branch. The user copied the old SourceMod file to the updated version of cime and thus lost a critical namelist initialization routine introduced in a later revision of maint-5.6. The result was that the diffuse albedo variable in shr_flux_mct were not initialized
correctly.

I believe that we can build a git based SourceMods mechanism that would have allowed the user to correctly merge her SourceMods file cime_comp_mod.F90 into the newer version in the maint-5.6 branch.

billsacks commented 4 years ago

Thanks for starting this issue, @jedwards4b . I'd be interested to hear more about how you imagine your idea being implemented and used, either in discussion in this issue or in person.

I was just about to suggest a different idea that achieves the same goal. I'll describe my idea here, then will be interested to have more discussion on these different possibilities, first with you and some other SEs, then also with some scientists / users to see what they think about these ideas.

Summary of my proposal

Do not allow full source files in SourceMods. Instead, only allow diff/patch files in SourceMods (as can be generated with diff -u). The build scripts would then apply the diffs to the applicable file at build time, dying with an error if the diff doesn't apply cleanly.

This proposal would force users to store their SourceMods in a format that would be much less likely to cause problems when they update their code version. It puts some required extra work up front for users, but not much (running diff -u to make a diff file after finishing making their changes), and in return gives much more safety in the use of SourceMods. The feature I especially like is: Once the user has their SourceMods in the correct diff format, this doesn't rely on them remembering to do anything more if they update their code base, transfer their SourceMods to a different case, etc.

Details from the user's point of view

Typically, a user will begin by making a copy of the file they want to change, and then will make the necessary changes directly to that file. This will be the same in the new scheme as it always has been. However, now they will NOT put that file directly into their SourceMods directory. (We could provide a standard directory for them to put this file in, or it could just be up to them where they want to store it. However, it's important to note that, once they're done making changes and have generated a diff file from these changes, this source file is no longer important, and can safely be removed.)

Next the user will generate a diff/patch file using the standard unix diff utility, and will put this in the standard SourceMods location: diff -u /path/to/original/file.F90 /path/to/modified/file.F90 > SourceMods/src.clm/file.F90.patch (I'd suggest supporting both .diff and .patch extensions.)

There is nothing more the user will need to do; the build scripts will handle the application of this patch file. If the user updates the code base used for this case (and if this results in a change in one of the patched files) the patch will automatically be reapplied upon the next rebuild. If the user wants to move these SourceMods to a different case (which seems to be a common need), they will move just the diff file.

Details of implementation

I believe that this can be implemented via a few small changes to each component's buildlib script plus a new function in CIME.

Changes in each component's buildlib script

Currently, component buildlib scripts list a set of paths like this:

            paths = [os.path.join(caseroot,"SourceMods","src.clm"),
                     os.path.join(lnd_root,"src","main"),
                     ...

The first element of these would be changed to instead point to a new location where the generated (patched) files will be put. This could be a hidden directory inside SourceMods like SourceMods/src.clm/.generated (which would be created by the CIME function), or it could be a directory in the bld path. (The advantage of putting it in SourceMods would be that it would be easier for the user to find and view the generated file. However, that's also a possible disadvantage, in that it could lead knowledgeable users to put files directly in the .generated directory, bypassing the diff/patch mechanism.)

In addition, a single line would be added to each component's buildlib script that calls a new CIME function that generates all out-of-date files from the patch files and the original source file. (This call would happen sometime before the main call to make.)

(Other than that, no changes should be needed in the components' buildlib scripts, or in CIME's Makefile: As far as the main build process is concerned, things are working exactly the same as before; it just happens that we have a different location for where to find modified source files, rather than directly in directories like SourceMods/src.clm.)

Details of new CIME function

I imagine a new function like:

def generate_patched_files(path_to_sourcemods, path_to_generated_files, sourcefile_paths):
    """
    Arguments:
       path_to_sourcemods: string, giving path to this component's SourceMods directory
       path_to_generated_files: string, giving path to directory where generated files
          should be placed
       sourcefile_paths: list of strings, giving paths to each directory in which source
          files can be found
    """

Each component will call this function from their buildlib script. Note that sourcefile_paths is already needed by components in order to create their Filepath file, so I believe that this puts minimal additional burden on components.

This function looks for all files in path_to_sourcemods ending with .diff or .patch. Let's say it finds a file named FILENAME.F90.diff. It then searches the directories given by sourcefile_paths until it finds a file named FILENAME.F90. It checks the mod times of FILENAME.F90.diff, FILENAME.F90 and (if it exists) the generated file path_to_generated_file/FILENAME.F90; if the latter is out of date, it is regenerated.

Generation of the patched file would be done using the unix patch utility. (Alternatively we could snapshot the single .py file implementing patching from https://pypi.org/project/patch/ into CIME, if that seems robust enough and we want to avoid relying on an external unix command.) If the patch doesn't apply cleanly, an error is generated, forcing the user to recreate the .diff file.

In addition, this CIME function will look for any files in path_to_sourcemods with a .F90 extension. If any such files are found, an error will be generated, with detailed instructions about how to use this new functionality (or a pointer to the part of the cime documentation where we will document this functionality). This check is particularly important in the near-term, in order to retrain users to use the new mechanism. Note that, other than this error check, .F90 files and other source files in SourceMods will be ignored: only .diff/.patch files will be handled.

Alternative implementations considered and tentatively rejected

I considered trying to build the logic for generating a patched file into the main Makefile. The advantage of this is that individual components' buildlib scripts wouldn't need as many changes (and it could possibly be done in a way that they wouldn't require any changes). However, I tentatively rejected this for a few reasons:

I struggled with how to write the makefile dependency rules for this
Even if we could come up with a robust way to write the makefile dependency rules for this, I began to feel that this would be more fragile
Components that don't use cime's standard Makefile (in CESM this at least includes CISM) would need a different implementation

Note that if this functionality is incorporated into the main make process, it's important to do this patching before calling mkDepends, because it will often be the case that SourceMods will change the dependency structure.

jedwards4b commented 4 years ago

Thanks Bill this is an intriguing idea. I think that if you modify it a little it would allow the current SourceMods methods to continue to work while providing the new method as well.

The thing I don't like about it is that it doesn't track changes in the patch file - if a user modifies a patch mid-run the original modification is potentially lost.

Here is another idea - how about if create_newcase created a local git repository of the entire case directory. This repository would be updated each time the case changed and the user would have the option to push it to a remote repo for posterity. A new utility would be introduced create_sourcemod which would be given a path to the file to be changed in the source tree . This allows the SourceMod file to track its original counterpart and we could implement tools to automatically merge to an updated source file. This method would also work to replace the current archive_metadata function and would allow a user to create their own testdb like functionality - especially useful for users who don't have or want access to our proprietary testdb. This functionality could be implemented as an optional argument to create_newcase retaining all of the current functionality - this would allow us and our more intrepid users to develop and mature the new software before potentially making it the default method.

rljacob commented 4 years ago

If you need to make some mods to the source, just create a git branch. Would the user's problem have been caught if they were trying to rebase their branch to a more recent maint-5.6?

billsacks commented 4 years ago

@rljacob you seem to be arguing against using SourceMods at all. There are some on the CESM side who would agree with you, but many others (myself included) feel that SourceMods provide useful functionality in some situations that we need to maintain in some form.

@jedwards4b I really like your ideas about adding some version control mechanisms for the case. This seems helpful regardless of what we do for SourceMods. My proposal could be combined with that idea by either putting the .diff file under version control or putting the generated file under version control (with the latter, the build script could make an automatic commit when the generated file is updated).

You're right that it would be easy to combine my proposal with current-style SourceMods, supporting both side-by-side. I feel that – at least long-term – we should stop supporting the current style, because they are way too error-prone. I could see supporting both side-by-side for some interim period, though – that's a good idea.

Let me explain a bit about my motivations, because I think that helps explain why I came up with this specific proposal:

One of the big problems I was trying to solve with my proposal was the common occurrence that SourceMods developed in one case later take on a life of their own: They are copied to additional cases of the same user, and even passed from one user to another, and are then used in many cases, with a variety of different code bases. My sense is that users generally do whatever is easiest. So a goal of my proposal was to make it easy for users to do the right thing in this situation (apply just the minimal set of diffs when moving SourceMods around), and much harder to do the wrong thing (they would need to dig into where the generated sources are kept, and copy those to some hidden directory in the new case that doesn't even exist prior to the build, rather than just copying the more visible diff file).
Another goal of my proposal was to avoid introducing more CIME-specific tools that users need to learn. Given the importance of the correct application of SourceMods, I was worried about a solution that relies on users knowing that they need to run CIME-specific tool X – and both knowing how to use that tool correctly and then actually remembering to use it when the time comes.
Yet another goal was to come up with something that could probably be implemented with a few days of work.

I'm having trouble picturing how the git repo-based solution, on its own, addresses (1) - or, if it does, how it prevents users from taking the easier approach of just doing a cp /path/to/oldcase/SourceMods/src.clm/filename.F90 /path/to/newcase/SourceMods/src.clm/filename.F90. As long as it remains easier for users to do that than to use the tooling we develop, I feel we're going to be fighting an uphill battle to get people to do the right thing.

But I'm interested to get a better understanding of what you have in mind. I'm especially interested in your thoughts on how to keep the case repository connected with the various source code repositories. Given that they don't share history, I could imagine somehow using git subtree operations to help with this, but I can't quite picture how that would work (which is probably due to my relative lack of experience with subtrees). I'm also having trouble picturing what a user would need to do when moving SourceMods from one case to another under this idea.

I think there are a lot of good ideas in this discussion and I look forward to talking more with you about this.

gold2718 commented 4 years ago

My feeling is that both ideas are a lot of technology chasing what is really a user education issue. I have a lot of questions of how this would play out. I will start with two.

IMHO, one of the major strengths of SourceMods is that it provides a mechanism to store a case as it was run: 'I ran the tax X case with these changes'. This functionality was key to developing CESM2 through hundreds of experiments. SourceMods helps the case directory serve as an experiment notebook which is a very important piece of scientific equipment. If you put any sort of version control into SourceMods, how will you tie specific runs to specific SourceMods versions?
I cannot even tell you the number of times I have had a diff patch fail because of some change in the source file. How will you keep track of the original file so that the user can recover from this situation? My experience is that many users have great difficulty dealing with patches -- is this really going to solve the problem without creating new ones?

billsacks commented 4 years ago

@gold2718 I want to reply to some of your points, particularly with respect to the patch-based solution:

both ideas are a lot of technology

I believe the patch-based solution could be implemented in 50 lines of code or less in CIME (somewhat more if you count extensive inline documentation and the text of extensive and descriptive error messages as code), plus one additional line of code in each component's buildlib file.

really a user education issue

For education, I'd suggest to users doing exactly what I proposed with the diff-based solution – but needing to do more of the steps themselves. My proposal was born out of wanting to make this easier on users: it's less work for us to implement a patch-applier in SourceMods than to constantly explain to users how to do this themselves, particularly in light of your good point that "many users have great difficulty dealing with patches". Also, it's hard to imagine that we'd be able to reach a majority of users with education alone, unless we have a way to prevent doing things the wrong way. We have enough trouble educating users within CGD about things like this, let alone external users.

I cannot even tell you the number of times I have had a diff patch fail because of some change in the source file. How will you keep track of the original file so that the user can recover from this situation? My experience is that many users have great difficulty dealing with patches -- is this really going to solve the problem without creating new ones?

I agree that there are potential problems. But I don't see a scenario where the patch proposal is worse than the current situation, and in most scenarios it seems significantly better:

Moving SourceMods to a different case using the same code base: The patch solution is no better or worse than the current situation
Moving SourceMods to a different code base, where there has been no change in the source file: The patch solution is no better or worse than the current situation
Moving SourceMods to a different code base, where there has been a change in the source file that does NOT conflict with the patch: The patch solution is significantly better than the current situation: In the current situation, a user who hasn't thought about the issues and/or hasn't tracked the original file will end up mistakenly changing other aspects of the source file; this is what happened in #3295 . With the patch solution, the patch will apply cleanly.
Moving SourceMods to a different code base, where there has been a change in the source file that DOES conflict with the patch: The patch solution, while not perfect, is still significantly better than the current situation: The current situation has the same problems as (3). The patch solution would lead to a build-time error, alerting the user to the situation. True, it might be hard to provide the user with all of the information they would like in order to recover in the most educated way, but they'd still be in a much better situation than they currently are: Currently, if they even realize there's an error, all they have to work with is the new source file and the SourceMods file. They then have to guess which diffs in the SourceMods file should be kept, and which diffs are due to advances in the source file. With the patch solution, there is much less guesswork. Even if they don't know the original file, and even if they are uncomfortable working with patches in an automated way, it will still be a lot easier for them to manually reapply the intended changes in this situation.
Moving SourceMods to a different code base, where the source file has been removed: The patch solution is still significantly better than the current situation: The current situation will likely either lead to some cryptic build errors or will lead to the SourceMods being silently ignored. The patch solution would lead to the user being clearly alerted to the problem.
Are there other common situations I'm not considering? I'd like to hear them.

A further advantage of the patch-based solution is that it makes it easy to see at a glance what the user actually changed. This is helpful both for the case being self-documenting, and for others (e.g., SEs) who are asked to look at the case. Your point about tracking the original file (or lack of such tracking) makes this all the more important, and was another motivation for my proposal: On numerous occasions, a user has shown me their SourceMods, but hasn't known the version on which they are based. That makes it impossible to see what they changed. With the patch-based solution, it would at least be easy to see what they intended to change; even if you can't get the patch to apply cleanly, you're in a much better situation than the current state of affairs.

I completely agree that this solution is not perfect. But I feel that it is significantly safer and more robust than what we currently have in place. Yes, it introduces its own problems, but they are of the "raise an error because the code can't handle the situation cleanly" nature rather than the current "silently do the wrong thing" nature. I feel that the former is vastly preferable to the latter.

billsacks commented 4 years ago

Also, while this point seems mostly aimed at @jedwards4b 's proposal, I want to take a stab at responding to it as well:

IMHO, one of the major strengths of SourceMods is that it provides a mechanism to store a case as it was run: 'I ran the tax X case with these changes'. This functionality was key to developing CESM2 through hundreds of experiments. SourceMods helps the case directory serve as an experiment notebook which is a very important piece of scientific equipment. If you put any sort of version control into SourceMods, how will you tie specific runs to specific SourceMods versions?

I again don't see how the proposal makes things any worse than the current situation, and it again improves things in many ways. We shouldn't reject new proposals because they aren't perfect, but rather should judge them based on whether they improve things significantly from the current situation.

I think you're arguing that a case directory should remain unchanged once the case has started running. That is a very sensible idea, and in that case, it doesn't matter very much whether or not the case is in version control. I believe @jedwards4b 's solution was meant to address the common situation where this advice is not followed – sometimes out of necessity like tweaking some parameter mid-run to get past an instability (that particular example likely isn't a SourceMod, I know).

As to how to tie specific runs to specific SourceMod versions, I believe this would be fairly easy: We could do something like have the driver namelist generation query the current hash of the case directory, adding that hash to the driver's namelist. Then this could be output in the cpl.log file.

jedwards4b commented 4 years ago

@gold2718 If the case is under version control every change in the case is recorded including what SourceMods are used in which run. I don't see how my proposal would do anything but strengthen the concept of the case directory as a scientists notebook. The detail of how to tie files in the SourceMods directory to their counterparts in the source tree is something yet to be worked out, but in my proposal the structure of the SourceMods directory doesn't change at all and the scientist can choose to use these new tools or not. I think that the will sell themselves and eventually become the default.

gold2718 commented 4 years ago

@billsacks & @jedwards4b,

Below is an attempt to continue the conversation while snipping a lot of material I am not responding to (this time).

I believe the patch-based solution could be implemented in 50 lines of code or less in CIME

No problem with your definition of LoC but you left out users needing to learn new techniques to create and deal with patch files. CIME insulates users from lots of technical details to ease their workflow (config_machines.xml and config_compilers.xml alone have saved my productivity and my sanity) so I think we should think carefully when we ask them to learn a new skill and provide some sort of support.

Moving SourceMods to a different case using the same code base: The patch solution is no better or worse than the current situation

This assertion leaves out the the users has to do more work for each SourceMod file. We should keep in mind this extra step and possibly try to mitigate it when working towards a solution.

Even if they don't know the original file, and even if they are uncomfortable working with patches in an automated way, it will still be a lot easier for them to manually reapply the intended changes in this situation.

This is a bold assertion for which I would like to see evidence. I have taught users to use a three-way merge which is what is required in this situation (scenario 4 from your list). Have you ever done your method (no original source file) with the typical grad student or postdoc?

A further advantage of the patch-based solution is that it makes it easy to see at a glance what the user actually changed.

Except in the simple cases, this is only strictly true if we know the original source file.

One additional point. One use of SourceMods is to compare case directories to see what was different. In the case where diffs are stored in SourceMods, users will be comparing a diff of diff files which can be an interesting experience (not too bad if you learn to do it properly but another new skill we are foisting on users).

I am going to follow up with an attempt to describe the problems and solution requirements.

jedwards4b commented 4 years ago

I think that we can tie SourceMods to source code in git using the https://www.pygit2.org/index.html API and a deep dive into understanding git. @billsacks solution has the advantage that we can implement it fairly quickly but I think that it's only a partial solution and it ties us to some arcane code for handelling diffs that is already implemented in the git api.

gold2718 commented 4 years ago

Does everyone agree with these two statements?

SourceMods provides a method for running a case with documented modifications from the model checkout. However, the CIME case control system currently provides no documentation linking a SourceMod file either to the original source code or to the run(s) made with that modification.
Any proposed solution should solve both of these problems without placing undue burdens on SourceMod users. To be clear, what needs to be documented is the difference between:
- The version of the code used to configure, compile, and customize (namelist) a run
- The model checkout version (not the SRCROOT sandbox).

If you have an issue, please respond with a restatement of both the problems and any solution requirements. Note that my statements are not meant to directly weigh in on the code vs. diff question.

gold2718 commented 4 years ago

In thinking about this, I cannot think of any way to do this without some sort of case control (i.e., something along the lines of @jedwards4b proposal). Perhaps the necessary information (or the entire case directory in @jedwards4b's proposal) should be captured at case.submit time with a commit message that includes the run information (RUNDIR and/or LID?) and all the model commit hashes from SRCDIR. Again, this is not a statement about the value of diffs vs complete files -- I still have not seen (or thought of) anything that looks like a complete solution to my problem statement with either version. One thing I worry about with the proposed solutions is the issue of users copying SourceMods from one case to a new one where the model checkout is different. I think what would make me feel better (which I assume is everyone's top priority) is a diff format that includes a pointer back to the original file (e.g., repo URL and hash). Of course, this implies more tools etc.

jedwards4b commented 4 years ago

All of the information that should be carried with a SourceMod file is already available in the repos .git directory - there must be a way we can attach that information to the file so that it is maintained if the user copies it from one case to another (even if that new case is on a different machine).

billsacks commented 4 years ago

Even if they don't know the original file, and even if they are uncomfortable working with patches in an automated way, it will still be a lot easier for them to manually reapply the intended changes in this situation.

This is a bold assertion for which I would like to see evidence. I have taught users to use a three-way merge which is what is required in this situation (scenario 4 from your list). Have you ever done your method (no original source file) with the typical grad student or postdoc?

I agree that just storing diffs is still not ideal. My point was just that it's a lot better than the current situation, where you can't even reliably determine the diffs. I'm tired of trying to help users migrate SourceMods into a more recent code base when all they have are the SourceMods F90 files themselves.

Regarding the recent comments: I'd be very happy with a solution that attaches the hash of the original code to each SourceMod file, as long as this is done in a way that makes it relatively fool-proof for this information to be carried along with this file as it gets passed around from case to case and user to user.

gold2718 commented 4 years ago

Regarding the recent comments: I'd be very happy with a solution that attaches the hash of the original code to each SourceMod file, as long as this is done in a way that makes it relatively fool-proof for this information to be carried along with this file as it gets passed around from case to case and user to user.

This is my biggest concern with solutions that involve a repository as part of SourceMods. If a user copies diffs or modified files without copying the underlying repo, we lose the paper trail. Given my earlier statements about the necessity of a repo, this puts me in a bind. Storing enough information in each SoureMods file might provide a solution -- the repo is only for storing the state of SourceMods for each run.

I have a concern with the SourceMods repo also storing basic case information. I think we need to separate them because we do want a way to copy SourceMod information from an existing case to a new one. If this includes copying a SourceMod repo, we do not want that overwriting anything the new case is using. If everything is in one repo, we must make sure each SourceMods file has enough information to stand on its own.

Noodling around -- this is an idea of what could happen if the files in SourceMods are some sort of patch file with a pointer to the original.

Check the current SRCDIR commit version against the information in the patch file. If no change, we just need to proceed using the current patch file (i.e., skip steps 2 & 3).
Attempt to apply the patch to the current SRCDIR commit version (may need to grab the correct file if (or in case) the SRCDIR version is modified). Patch application failure leads to an error with instructions for performing a three-way merge.
Apply the patch and then create a new patch file including the updated repo version information.

billsacks commented 4 years ago

I conducted a USUI this morning (UCAR Shuttle User Interview) with Justin Small, running by him the idea of storing SourceMods as diffs. His initial response was one of skepticism ("what would be the point of that?"), but once I explained two points (making it less error-prone to move SourceMods to a different case that uses a different set of sources; and making it easier to get old SourceMods onto a branch in order to integrate them into master) he nodded in agreement and said he thought this seemed like a good idea.

@Katetc was also part of this conversation, but I'll let her chime in with any thoughts she has herself rather than trying to speak for her.

billsacks commented 4 years ago

I brought up the idea of storing SourceMods as diffs in a telecon with members of CESM's Land Ice Working Group. Bill Lipscomb was supportive of making this change, and nobody objected to it.

ekluzek commented 4 years ago

I was on the telecon with Bill as well, so let me correct @billsacks to point out that there was strong support for it. In part of the telecon it was brought up that the SourceMods problem had resulted in an expensive error that required rerunning a simulation. A SourceMods file from a different version was put into a simulation with an update that ended up removing a critical piece of code. So I'd say it's more than "nobody objected to it" it was "this would have prevented an expensive error that caused us to rerun a simulation". Rerunning the simulation means they are behind in the analysis they wanted to do, and where they wanted to be at this point. The problem was brought up independently from Bill talking about it, but it enabled the discussion to have a descriptor for what the problem was. And it also enabled having a proposed solution for that specific issue for the future.

jedwards4b commented 4 years ago

In the interest of correctness going forward please note that the problem was not an mpi broadcast, but rather a namelist initialization and read that didn't take place.

Katetc commented 4 years ago

Ok, I just read through this whole thread. And I was on the LIWG call and on the shuttle the other day with Bill. Here's a few of my thoughts right now:

SourceMods provides a method for running a case with documented modifications from the model checkout. However, the CIME case control system currently provides no documentation linking a SourceMod file either to the original source code or to the run(s) made with that modification.

Any proposed solution should solve both of these problems without placing undue burdens on SourceMod users. To be clear, what needs to be documented is the difference between: The version of the code used to configure, compile, and customize (namelist) a run The model checkout version (not the SRCROOT sandbox).

I'm glad that Steve is clearly defining the problems here, but I disagree that any proposed solution has to solve all of them right now. I think it is pretty clear from this discussion that a solution that would solve both of these problems will likely require a radical change in user workflow and a fair or a lot of work on the part of some software engineers. And this may or may not be the time to do all of that. I think it is very valid to just start making progress, or make some changes that address part of the problem, without needing to fix the whole issue at once.

I am honestly leaning towards the side of getting rid of SourceMods all together, and just having users develop within Git branches in the long run. I think it would be much easier to tie the list of Externals and/or a git diff of the Source (for un-checked in changes) to a run and perhaps store that in with the logs for that run. Maybe as a separate SourceMods.log . I feel that source should stay in the source tree for simplicity and source control. Maybe I'm old, but during my postdoc working with CLUBB and CAM, we used only SVN branches for our development. I had not used a sourcemod at all until about 3 years ago. I am continuing to use this development methodology in microphysics as well. We do not use sourcemods in micro_mg development.

However, I think it would be SUPER TACKY to just get rid of all sourcemods tomorrow. I think we need to get most of our userbase well versed and comfortable in Git before we move to that workflow. I'm thinking 2-3 years out for that change.

So, I think using the Diff/patch method is by far the easiest and least error-prone suggestion that can be implemented immediately. And I think it should be implemented relatively soon. It would be good to talk about at the co-chairs meeting, and try to send out announcements on either the CESM bulliten boards or email lists (or both). When people are aware that a change like this is coming their way, and that they have some time to prepare for it, it really helps the transition process.

Katetc commented 4 years ago

Oh and a quick PS, I could see encouraging these issues of workflow to be address in the CESM Workshop. Perhaps, the hands-on sessions could include making a source diff and using that as a mod. But also making a git branch and using that as your mod.

Really, if people get used to using diffs instead of sourcemods, it's kind of a baby step towards getting rid of sourcemods all together, and just dealing with changes (and patches) in Git.

billsacks commented 4 years ago

@Katetc thanks for sharing your thoughts. I agree that we shouldn't feel we need to solve the whole problem at once here. On the other hand, @mvertens pointed out to me that it would be a good idea to make ensure we've come to a reasonably firm decision on any changes that impact user workflow before putting them in place: it would be uncool to retrain everyone to use diffs for SourceMods, only to change back to using regular source files a few months later. I feel that a good compromise is that we should do enough research up-front to decide on any changes we want to make to the user experience of SourceMods, then we can approach the actual implementation incrementally to the extent possible.

I disagree about the long-term vision of doing away with SourceMods. I don't ever use SourceMods for my development work either, but I do think they have an appropriate time and place. On the other hand, it bothers me when I see presentations from people in CGD – such as at the annual CESM tutorial – that include strong language like, "Never change source code in place; always use SourceMods". Here are my rules of thumb for when it's appropriate to use a git branch vs. SourceMods:

Use a git branch, making changes in-place, for changes that have some of these characteristics:

Long-term developments
Incremental changes towards a final solution
Changes that apply to many cases
Changes that touch many files
Any change intended to eventually come back to master

SourceMods are still okay for changes that have these characteristics:

Short-term developments
Changes that apply to just one or two cases
Changes that touch just a few files

Some final thoughts about provenance inspired by your comments: Recording provenance is relatively easy when all of the source code is available in the main repo for a given model. This gets harder when using a branch / tag that is only on someone's fork and may never come into the main repo; in this case, the owner of the fork needs to keep their fork around long-term for provenance to remain intact. Provenance seems nearly impossible if you have committed changes in a local repo that haven't been pushed anywhere. These thoughts are relevant for thinking about the best way to record how a run was done: In some ways, it may be preferable to use a documented set of SourceMods off of a version in the central repo rather than just making a local branch for your work. It feels wrong to discourage people from making branches, but if we want cime to help record provenance, I wonder if we'd need some mechanism for recording any committed diffs that aren't available publicly???

billsacks commented 4 years ago

Another advantage of storing SourceMods as diffs is that it would make things like this easier to work with and maintain: https://github.com/ESCOMP/CESM/pull/135 . Of course, one could argue that SourceMods should never have been used for a purpose like that in the first place, and I would support that argument... but if we are going to allow SourceMods in usermods directories, then I feel that storing them as diffs would be significantly better than storing them as complete files that have been ripped away from their original version control history.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.

ekluzek commented 1 year ago

This really would be a good thing to have implemented. SourceMods seem to be sticking around, and this would make them more robust. But, it also doesn't seem like a high priority to actually implement.

gold2718 commented 1 year ago

This really would be a good thing to have implemented. SourceMods seem to be sticking around, and this would make them more robust. But, it also doesn't seem like a high priority to actually implement.

While SourceMods may be sticking around, I continue to find problems where SourceMods have been passed around and are being used with code that is not compatible (so all too often, the source of the problem is SourceMods). I don't see how any of the suggestions above will solve the problems I am running into "out in the field". In particular, CAM has been undergoing some badly needed refactoring (e.g., the new aerosol interface code) and even patch files would not be applicable (so they would just cause more case failures which would be hard for users to understand IMHO).

billsacks commented 1 year ago

At least with a patch file we could catch an error applying the patch and give a meaningful message like, "it looks like this patch file is from a significantly different code base so cannot be applied cleanly".

gold2718 commented 1 year ago

Maybe I've missed it but has anyone proposed a workflow for this? Users modify a copy of the source or (quite often) copy "SourceMods" from someone else or from one of their old cases. When do these patch files get applied? What if the source tree is also modified?

billsacks commented 1 year ago

Maybe I've missed it but has anyone proposed a workflow for this? Users modify a copy of the source or (quite often) copy "SourceMods" from someone else or from one of their old cases. When do these patch files get applied? What if the source tree is also modified?

See my original comment above for my proposed workflow. There was a lot of additional discussion in this issue and I'm not tied to my original suggestion, but just wanted to make sure you have seen that.

gold2718 commented 1 year ago

I guess I never understood how this workflow would work. Let's say I modify dp_coupling.F90 in CAM. If I understand correctly, at build time, the system would create a full dp_coupling.F90 from the dp_coupling.F90.diff file in SouceMods/src.cam. Is that correct?

What happens if I copy the file and modify it but do not make a patch?
What happens if I make a copy of a file like dp_coupling.F90? CAM has 5 versions of this file, which will be used to recreate the modified dp_coupling.F90? I guess if we insert generate_patched_files after Filepath is generated but before Depends is generated, that could find the correct version (assuming of course that the full partial path in the patch file is used in the file search). Of course, if the user modified the wrong file (e.g., by copying a modified dp_coupling.F90.patch from another case that used a different dycore or a physpkg.F90.diff from a case that used a different version of physics), I can imagine how fun it would be to figure out what happened since the compiler error will now show up in a file generated by the build system, not a user-written SouceMods file.

Still not a fan.

billsacks commented 1 year ago

If I understand correctly, at build time, the system would create a full dp_coupling.F90 from the dp_coupling.F90.diff file in SouceMods/src.cam. Is that correct?

Yes, that was my proposal.

What happens if I copy the file and modify it but do not make a patch?

That would no longer be supported. Probably best would be to generate an error message in that case.

What happens if I make a copy of a file like dp_coupling.F90? CAM has 5 versions of this file, which will be used to recreate the modified dp_coupling.F90?

To be honest, I don't think I considered this scenario. This feels like something that wouldn't be handled well with the current SourceMods mechanism either, right? I can see how a patch-based mechanism would make things more confusing in this scenario. Is this a common scenario for CAM?

Still not a fan.

Fair enough. And I would by no means suggest that my earlier proposal is perfect. I personally feel like it's a step in the right direction, solving (significantly?) more problems than it creates, but I'm really not tied to it if others disagree and/or see a better way forward. As I said in an earlier comment:

Yes, it introduces its own problems, but they are of the "raise an error because the code can't handle the situation cleanly" nature rather than the current "silently do the wrong thing" nature. I feel that the former is vastly preferable to the latter.

But, having just skimmed back through many of the comments in this issue, I think there are some other good alternative or complementary ideas relating to putting SourceMods under version control and storing more metadata with the SourceMods. My intent in my recent posts wasn't to push strongly for the patch-based solution, but just to respond to some of the points / questions raised in the last few days.

gold2718 commented 1 year ago

Is this a common scenario for CAM?

Yes. It works fine as long as the file is for the correct dycore (or physics suite) and is not out of date. What I see is that problems occur when an incompatible file is copied in. That is a problem for all SourceMods.

That would no longer be supported. Probably best would be to generate an error message in that case.

Maybe a dramatic shift in the workflow for SourceMods is a good thing. Every proposed workflow I have seen is harder for the user. Maybe if we make it hard enough, people will stop using them.

Back when the model was under Subversion, SourceMods fulfilled a real need as most users did not have access to creating their own branch. Now, anyone can make a branch and store it in an accessible repo (e.g., a GitHub fork). I do not see any case where SouceMods will provide equivalent or better experiment documentation than a tag with good Externals files. Can someone provide a counterexample?

That would no longer be supported. Probably best would be to generate an error message in that case.

If we are going to require the users to learn a new method of modifying the code, why not just say modified files must be in a git branch and provide a pointer to documentation on how to do it?

klindsay28 commented 1 year ago

When I'm helping a colleague debug a run that isn't behaving, I regularly clone their case, copy some files to SourceMods in my clone, add write statements, and run. This avoids me needing to create a copy of their sandbox, which might have branches checked out besides what is in Externals, or have modified files that are not committed in any repo. I don't see how to easily accomplish what this workflow accomplishes without SourceMods. I'm happy to learn a different way to do this though.

billsacks commented 1 year ago

I don't want to take on the role of defender of SourceMods, but talking to some users in the past convinced me of the value of the SourceMods mechanism, such as for being able to quickly test cases with small variations on some base code configuration. Creating a git branch is lightweight, but there are potential new issues that would arise if we did away with SourceMods, particularly the need to have a separate sandbox for each case that would stick around unchanged for the duration of the case. That said, I can see some argument that dealing with the downsides of requiring git branches could be preferable to dealing with the downsides of other proposed mechanisms for maintaining SourceMods.

gold2718 commented 1 year ago

When I'm helping a colleague debug a run that isn't behaving, I regularly clone their case, copy some files to SourceMods in my clone, add write statements, and run. This avoids me needing to create a copy of their sandbox, which might have branches checked out besides what is in Externals, or have modified files that are not committed in any repo. I don't see how to easily accomplish what this workflow accomplishes without SourceMods.

This is a good example. In this case, I usually just ask for write permission to their sandbox so they can see (and potentially benefit) from my work. However, I also mention that this is not a careful way to develop code. The more modified code lying around, the harder it is to pin down how things went wrong. A little time on branch training makes everyone's life easier in my experience.

gold2718 commented 1 year ago

but there are potential new issues that would arise if we did away with SourceMods, particularly the need to have a separate sandbox for each case that would stick around unchanged for the duration of the case.

I know of at least one publication that points to nice tags but where the experiments were done with SourceMods so the publication is not reproducible. It is so easy to forget they are there.