janpfeifer / gonb

GoNB, a Go Notebook Kernel for Jupyter
https://github.com/janpfeifer/gonb
MIT License

Strange effect with `!* go mod edit` which may need some documentation or some better solution. #43

Closed oderwat closed 1 year ago

oderwat commented 1 year ago

I use !* go mod edit -replace git.jetbrains.space/metatexx/mxc/hog="/Users/$(id -un)/devsys/workspace/go/hog" in my first cell. But I want to do A/B testing, and when I comment out this line, the go.mod file does not "revert". I need to use an additional entry with !* go mod edit -dropreplace git.jetbrains.space/metatexx/mxc/hog to get rid of it. Could that be solved differently?

janpfeifer commented 1 year ago

hi @oderwat , sorry I didn't understand what is happening. Pls, correct me if I'm wrong: you have two cells (A/B testing), one with something like:

!* go mod edit -replace "git.jetbrains...=/Users/..."
%%
// Test something

And the second with:

!* go mod edit -dropreplace "git.jetbrains..."
%%
// Test something

And it doesn't work as intended (one executing with the replace rule and the other without it) ?

janpfeifer commented 1 year ago

(Btw, thanks for the issue reports, pls keep them coming whenever you find something)

oderwat commented 1 year ago

I'll try to explain how it happened: There is only one "first" cell. Consider it empty at first. The second cell has my imports. The remaining cells have the tests/experiments I am working on. Then I decided that one of the imports needed to be modified. So I added the replace statement to the mod file using go mod edit in the first cell. Then I checked my tests and, well, everything worked fine. While doing that, I wrote some more code, and now I want to check what happens without the modification. So I remove the go mod replace in the first cell and "run" everything starting with the first cell again. Then I wondered why my test code still reflected what happened after resolving the problem. Therefore, I added a !* cat go.mod and found that even if the go mod edit was commented out, the go.mod file still had the replace statement. I guess you build it and cache it in some way so that it does not get reverted when the first cell changes. I had to use dropreplace to get rid of it.

So my first cell looks like this now:

// setting up local modules
!* go mod edit -replace git.jetbrains.space/metatexx/mxc/hog="/Users/$(id -un)/devsys/workspace/go/hog"
//!* go mod edit -dropreplace git.jetbrains.space/metatexx/mxc/hog
!* cat go.mod

When I comment out the 2nd line after running it once, it will still show the "replace" in the go.mod file until I have used the "dropreplace" line once.

Man, that is maybe still hard to understand. I made a screen recording :)

[Screen recording: gonb-require]

janpfeifer commented 1 year ago

Oh, ok. So maybe the confusion is that the go.mod file is preserved, and GoNB mostly doesn't touch it directly after creating it (except for %goworkfix). You could even go to a terminal and edit it.

Only the main.go file is regenerated at every cell execution.

We need to preserve/cache the go.mod otherwise Go would try to fetch (at least the version) of every included library every time. So it's simpler just to preserve the go.mod. I'll add a note to the tutorial and documentation (%help) on that.

Does that make sense ?

In which case I would suggest using 2 cells for your test: one with the !*go mod edit -replace and another with the !*go mod edit -dropreplace. If you put whatever you are testing in a function, the second cell can have only 3 lines:

!*go mod edit -dropreplace ...
%%
YourTestFunc()

But it's just one way of doing it.

Btw, notice !* go mod edit -replace is idempotent, you can run it multiple times, if the replace is already there, it won't add more than once. Same with -dropreplace.

Or do you have another suggestion ?

oderwat commented 1 year ago

I think I understand why it does that. But that seems "bad" to me because it makes the notebook not reproducible. I share them with others, and we will get different results, depending on invisible state.

I think there needs to be an instruction that lets me force the recreation of a clean "go.mod" file, like one gets when restarting the kernel. Just to make sure that there is no hidden state. This should only run when the cell that contains it is executed. So it will not slow down anything else. I actually would do this every time the first cell is changed and document it as how to start with a clean "go.mod" file, maybe.

Perhaps I need to explain how I use it: The first few cells set up the includes and some helper functions, like database or NATS cluster connections, logging configuration, global variables with tokens and keys.

Later cells use all of them for different tests and experiments.

So I usually run the "head" stuff just once and then work on one or multiple later cells. When I want to change package versions, I add the required replacements (or "go.work") and run the modified cells. Currently, I need to restart the kernel to get a "clean" new base. But this is just because of the "go.mod" file not being "cleared".

janpfeifer commented 1 year ago

You are right about the invisible state, and that being "bad"!

But truth is, that is the nature of notebooks, same for Python notebooks: if things are executed in a different order, or if code is executed (functions, variables defined) and later removed from the cell, it still impacts later cells, but is no longer reproducible. The usual way to work with notebooks (again, same in Python as well) is, in the end, if one wants it to be reproducible, to restart the kernel and run everything from the start. The sad thing is that sometimes some of the steps are costly (take a long time, like in machine learning) and are not easy/cheap to rerun ... :(

From the description of your problem, it is interesting that you want to preserve part of the hidden state (the Go functions and variables defined elsewhere), but not the other part, the go.mod directives. For clearing the Go hidden state there is already the %reset special command. But nothing similar for the go.mod hidden state.

The ad hoc way to reset go.mod would be:

!* rm -f go.mod ; go mod init $(basename $GONB_TMP_DIR)

If this command solves your problem, let me know, I could add a special command, let's say %gomod-reset ? Any suggestions for the name of the command ?

Or any suggestions for an ergonomic way of doing this ?

janpfeifer commented 1 year ago

Actually thinking about it, I would propose 2 changes:

  1. %reset also by default should reset go.mod.
  2. Add an optional parameter, like %reset go.mod, that will only reset go.mod but not the Go state (variables, functions, types, ...)
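
To make the difference between the two forms concrete, here is a toy model of the proposed semantics. It is only an illustration under assumed names (`kernelState`, `reset`, the field names); it says nothing about how GoNB implements %reset internally.

```go
package main

import "fmt"

// kernelState is a hypothetical stand-in for what a kernel remembers between cells.
type kernelState struct {
	Decls   map[string]string // Go functions, variables and types defined so far
	GoMod   string            // current go.mod contents
	baseMod string            // pristine go.mod, as created at kernel start
}

// reset models the two proposed forms: reset("") clears the Go declarations
// and go.mod, while reset("go.mod") restores go.mod only.
func (s *kernelState) reset(arg string) error {
	switch arg {
	case "":
		s.Decls = map[string]string{}
		s.GoMod = s.baseMod
	case "go.mod":
		s.GoMod = s.baseMod
	default:
		return fmt.Errorf("unknown %%reset argument %q", arg)
	}
	return nil
}

func main() {
	base := "module example.com/scratch\n"
	s := &kernelState{
		Decls:   map[string]string{"YourTestFunc": "func YourTestFunc() {}"},
		GoMod:   base + "replace example.com/hog => ../hog\n",
		baseMod: base,
	}
	if err := s.reset("go.mod"); err != nil {
		panic(err)
	}
	fmt.Printf("after reset go.mod: %d declaration(s) kept, go.mod clean: %v\n",
		len(s.Decls), s.GoMod == base)
}
```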

Wdyt ?

oderwat commented 1 year ago

Lgtm !

janpfeifer commented 1 year ago

This should do it. If you clone the branch reset (https://github.com/janpfeifer/gonb/tree/reset) you should have the functionality.

I tested it manually here, and it seems to do what was intended -- and resetting the go.mod with "%reset" was a necessary change. But I'll wait for your test before creating the new release.

oderwat commented 1 year ago

It works for me too, and now we can just add the %reset and have a more predictable state. Thank you!

janpfeifer commented 1 year ago

And released as v0.7.6!