LFS/Serialization: necessary or not?

mmesiti commented 11 months ago

(continuation of https://github.com/mmesiti/PMFRG-copy.jl/issues/3)

Will the package manager also download the LFS objects?

Some considerations:

Maybe we can keep the artifacts in a "development" branch, but not in the main/master.
do we need those regression tests during development? If we have other tests for getXBubble! that we know are sufficient, we can just ditch that code and the necessary data.
If we like those tests but we do not like the files, we can find a separate solution for storing those. But then
- ~When running the tests, we should at least launch a warning if the regression test data is not available.~ I think the best would be to actually leave it as it is now and throw an error (if we value those regression tests).
We could do a better job at serializing the objects (and retrieving them), possibly saving space.
[X] use ~HDF5~ JLD2 instead of Serialization. Serialization is brittle, might break between Julia versions
[X] For the regression test: ~N=7~ N=5, SquareLattice, Nlen = 4
- [X] [Might fail, check - fixed #14 ]

mmesiti commented 11 months ago

Problems with CI related to LFS (https://github.com/mmesiti/PMFRG.jl/tree/CI-LFS, WIP)

mmesiti commented 11 months ago

A possibly hard problem with the current serialization/deserialization approach for the current regression tests is that since the tested function takes a Workspace as an argument, any change in one of the data structures in a Workspace (in particular, the ones related to #11) causes the regression tests to fail (as mentioned here)

NilsNiggemann commented 11 months ago

Yes I think these kinds of problems should be avoided as much as possible. The tests should better not break upon modifying internals only. I think instead of serialising the workspace type, we should check against the actual array types that it contains.

Or: What do you think of performing some simple checks involving only some elements of the returned values, i.e.

getXBubble!(workspace,...)
@test workspace.X.c[1,2,3,5] == ...

Another interesting idea: Perhaps we could implement some kind of "hashing" of the underlying arrays. Using an actual hash is probably an issue due to the same reasons. But if we simply check the sum(abs,X.a) etc, we can probably also catch essentially all problems and we will also have a bit of tolerance in case there are some changes within floating point accuracy.

mmesiti commented 11 months ago

I have been imprecise: the reason why the regression test fails now is deserialization (here https://github.com/mmesiti/PMFRG.jl/blob/MPI-ext/test/RegressionTests/PMFRG.getXBubble.jl#L5, or https://github.com/mmesiti/PMFRG.jl/blob/CI-LFS/test/RegressionTests/PMFRG.getXBubble.jl#L5).

Of course, if we write to disk only the bits that we care about instead of the whole worspace it could be much better. The attractiveness of just serializing the tuple of arguments to getXBubble! resides in the fact that it's super simple and can be done automatically. Anyway, these are the possible solutions for this:

Most changes to the data types can be trivially proven not to break the logic of the functions tested by the regression tests. So, one can do the change, re-generate the regression test data and replace it in the repository. Advantages: Relatively simple. Disadvantages: requires manual labor every time one makes a change that breaks deserialization.
Carefully craft the regression test case, saving to disk only the things that are important and then recreating all the non-interesting bits in the setup of the regression test (which is not any more "just read input and expected output from disk"). Advantages: done only once, hard to break afterwards, does not require anyone to fiddle again with Recorder.jl. Disadvantages: requires more labor to set up.
Change Recorder so that it serializes in a human-readable way (e.g., by writing code instead of a binary file). Advantages: Simple modification of the saved data structures Disadvantages: Requires some extensive work on Recorder.jl
Write (de)serialization library that can allow users to specify data migrations (totally impractical)

I think that 2 (or perhaps 3) are the most reasonable things to do. Regarding the "hashing" idea, it's interesting to test the output, but what about the input? Generating a valid, realistic input instead of recording it could also be a solution.

(TO BE CONTINUED)

mmesiti commented 11 months ago

After reverting the change to OneLoopParams in 58c8514, the de-serialization does not fail any more. Another good side effect of decoupling. But still the problem remains, although less pressing.

mmesiti commented 11 months ago

just for the record, I tried this

Change Recorder so that it serializes in a human-readable way (e.g., by writing code instead of a binary file).

which seems a good idea in general for Recorder, to record data in code that can be read and modified by a human. I have converted the data to a string using repr, and then formatted it. The process was extremely slow, requiring 2200 seconds for Nlen=4 and N=8 (15MB of text - autoformatters are not made for that), and eventually the obtained string could not be parsed because of the following error:

ERROR: MethodError: Cannot `convert` an object of type Int64 to an object of type sumElements

because the representation of the data contains things like

sumElements[1 2 3]

which apparently cannot be parsed correctly.

What I did, more or less:

data = deserialize("PMFRG.getXBubble.data")
datastr = repr(data) # this returns all data in a single line, which is not really wieldly
open("datarepr.jl",write=true) do f
    write(f,datastr)
end

using JuliaFormatter
@time format("datarepr.jl")
# 2155.313091 seconds (30.04 M allocations: 1.889 GiB, 0.03% gc time, 0.00% compilation time)
datastr_formatted = read("datarepr.jl",String)

data_again = eval(Meta.parse(datastr_formatted))
# ERROR: MethodError: Cannot `convert` an object of type Int64 to an object of type sumElements

NilsNiggemann commented 11 months ago

This seems to me a bit like reinventing the wheel. I don't think we can ever get more sophisticated with our solution that JLD2. Imo, Recorder.jl could be used more as a quick on-the-fly development tool. Maybe to generate unit tests, one can reduce the functionaliuty to using only arrays and use HDF5/CSVs?

mmesiti commented 11 months ago

I don't think we can ever get more sophisticated with our solution that JLD2.

Prompted by this comment (and other praises I've heard of JLD2 previously), I checked JLD2 and found that in addition to the benefit of not breaking between Julia versions, it can also deal with issues when datatypes change (https://juliaio.github.io/JLD2.jl/dev/advanced/#Explicit-Type-Remapping), which is exactly what I was concerned about when I wrote

Write (de)serialization library that can allow users to specify data migrations (totally impractical)

to be fair, that's exactly what you said here

Imo, Recorder.jl could be used more as a quick on-the-fly development tool.

That's exactly the idea. I can switch there from Serialization to JLD2, and most of our problems should be fixed.

Maybe to generate unit tests, one can reduce the functionaliuty to using only arrays and use HDF5/CSVs?

Saving to disk and reading only the arrays would make sense as well, but to me it looks like this could take a lot of time to study the code. getXBubble! takes as an argument the whole Workspace, so I would have to analyse what is in a workspace, what is relevant and I should track, and what is not relevant and I can fake (to create a "fake" workspace, just for testing). I thought that serializing the whole workspace was conceptually much simpler, general and much less error prone. I also assumed that passing the whole Workspace to getXBubble! can be desirable since it does not constrain further development (you just pass everything the function might need), although this tends to go against the philosophy of interface segregation which typically makes testability much simpler. Also, given that JLD2 allows to specify data migrations, we might not need to do that.

mmesiti commented 11 months ago

Using N=5 instead of N=7 to keep the size even smaller (2.2MB), since we know N=5 now works

NilsNiggemann commented 11 months ago

Another take at this: In my (currently private) development branch of SpinFRGLattices, I have changed the geometry struct to be slightly more general by allowing for NLen to be a non-integer.

(This is useful in some more complicated lattices, in which it is better to define the system size via a sphere of continuous radius NLen.)

Of course, this breaks JLD2s serialization format and hence throws an error in the unit tests.

This is a bit of a problem, since I would consider this redefinition a non-breaking change. It seems to me that these regression tests will always remain very sensitive to the changes in the internals of dependencies. Do you think it might be better to define a more explicit version of getXBubble that does not take the full workspace and perform the checks only on those values that are actually changed? We can still leave a convenience method getXBubble!(Workspace,...) = getXBubble!(Workspace.X, Workspace.State, ...) that is not part of the regression tests.

mmesiti commented 10 months ago

Do you think it might be better to define a more explicit version of getXBubble that does not take the full workspace and perform the checks only on those values that are actually changed?

There is a subtle balance between two aspects: readability and testability on one side, and (some) flexibility on the other. For readability and testability, it is good to have a function depend only on the data that it really needs. In that way it is clearer what the function does exactly, what it does, and it is easier to build test cases (not only for regression tests, but also unit tests). But if at some point you realize that your function needs another argument, or could do something additionally more efficiently than having a separate function, then just passing the whole state of the program to it gives you the maximum flexibility.

I tend to prefer readability&testability to (some) flexibility, because I think that most of the time one can achieve flexibility in other ways. But I imagine that readability for you likely does not matter much, because you have the domain knowledge to know exactly what that function does (and you wrote the code), and the testability aspect looks important now because we have these pesky tests for that function, because I needed them to make sure the parallelization I made does not break the code.

Anyway, from my point of view this would be a good refactoring.

mmesiti commented 10 months ago

Do you think it might be better to define a more explicit version of getXBubble that does not take the full workspace and perform the checks only on those values that are actually changed

I can try to make this change myself if you like

NilsNiggemann commented 10 months ago

Readability does matter to me and I agree with you that changing this likely improves upon it. I think this change also does not significantly hinder development speed, so it is probably a good idea to implement this change.

I think the testability is perhaps even more important, in particular I am facing more issues with tests failing locally as there seem to be some issues reading the Workspace. I'd really like resolving this issue before merging this PR, so if you could take a look at it that would be great!

mmesiti commented 10 months ago

I have a preliminary draft here: https://github.com/mmesiti/PMFRG.jl/tree/getXBubble-testability-refactor-1

It is not yet tested (*), before I spend more time on this I wanted to ask if this is what you had in mind.

In this draft, I have taken all the callees of getXBubble! (not getXBubble! itself yet), and made so that the parameter list contains only quantities that are actually needed by every function. I moved everything to kwargs (or at least that was the plan). The result is that now all the functions have a lot of arguments. Some considerations on this:

It is very clear, but it looks a bit like it has been written by a maniac (or written in Fortran77). Reading such a function signature is fine when one wants to know the details, but it is not fine if one wants to have a bird's eye view of what the function does.
When the Workspace parameter was there, the types of the objects inside it could be trivially inferred . Now this is not the case any more. Would this case any issue?

Only in the case where all the members of a structure are used, I pass it as a whole. This happens rarely, but it happens. We could relax this rule, and pass the whole structure to the function even if not all the members are used. My conjecture is that this could be done safely for those types that PMFRG owns, because for those types coupling should not be a problem. For the types that PMFRG does not own (e.g., the ones that come from SpinFRGLattices, i.e. the System, right?) I think it would be best to split them, to reduce the coupling instead.

What do you think?

(*) : The reason why it is not tested is that I would like, ideally, to make small changed and run the regression tests every time. Unfortunately this would be very slow. So what I have done is to make a lot of small commits that should in principle all work, plan to test only the last one (maybe we're lucky and it passes) and then plan to use git bisect in the very likely case that it does not work, to find the commit that broke the code. When I find it, fix the commit and rebase all the later commits on the fix. In this way, instead of $N_{commits}$ runs of the test suite, I need only $log2(N{commits})$, so I can keep the commits as small as it is comfortable for me.

NilsNiggemann commented 10 months ago

Thanks for the effort! However I think the lack of inferability will likely be a problem, likely with big performance regressions: The reason is that Julia does not infer keyword arguments (to improve compiler performance). As a result the compiler will not know the types inside of addX! and cause a lot of allocations each time the function is called (actually this should probably break a unit test)

I was thinking more along the lines of extracting from the workspace what is actually mutated and leaving the params as an argument - Personally, I think this is possibly the most readable variant as one does not get overwhelmed with all the arguments. Perhaps more like this:

function getXBubblePartition!(X,State,Deriv,Lam,Buffer,Par, isrange,  itrange,  iurange)
    [...]
end

This would make it more transparent, which object will actually be mutated, namely only X. (Not completely sure which order of the other arguments is more sensible though) We could then change the unit tests by converting X to an bigger array (I want to store X like this anyway in the future) something like this would avoid serialization that breaks too easily (not tested whether it works):


convertXToArray(X::BubbleType) = stack((X.a,X.b,X.c,X.Ta,X.Tb,X.Tc),dims = 5)
saveX(X) = h5write("X.h5","X",convertXToArray(X))
readX(file) = BubbleType(eachslice(h5read(file,"X"),dims = 5)...)

If we are very paranoid , we could also simply check whether any of the other states are mutated I suppose. What do you think?

mmesiti commented 10 months ago

Ok, this makes sense. It will require some important changes to the regression test logic but it will make them more robust.

In particular, I will need to save and read at least X, State and Deriv. The parameters can be re built from code instead, which might still break if the constructors change, but can be fixed much more easily.

I can try to make these changes today.

NilsNiggemann commented 10 months ago

Thanks! I think if the parameters get too complex, we can perhaps ignore them for this PR. In general there should never be anything in the code that modifies the params, so I think it is anyway better to implement an independent test that verifies this:

using PMFRG, SpinFRGLattice,Test
for type in [:Tuple,:UnitRange,:AbstractArray,:Number,:Bool,:String,:Symbol,:Nothing,:Missing]
    @eval IsEqualRecursive(x1::$type,x2::$type) = x1 == x2
end

function IsEqualRecursive(x1::T,x2::T) where T
    result = true
    for k in fieldnames(typeof(x1))
        result = result && IsEqualRecursive(getfield(x1,k),getfield(x2,k))
    end
    return result
end

IsEqualRecursive(x1::T1,x2::T2) where {T1,T2} = false 

function TestParams(;kwargs...)
    Par = Params(getPolymer(2);N=5,kwargs...)
    Par_comparison = Params(getPolymer(2);N=5,kwargs...) #the same parameters

    @testset "Params" begin
        @testset "before run" begin 
            @test IsEqualRecursive(Par,Par_comparison)
        end
        SolveFRG(Par)
        @testset "after run" begin 
            @test IsEqualRecursive(Par,Par_comparison)
        end
    end

end
TestParams()

Perhaps this should be included in the tests that I am doing anyway. Do you think this would be good? Edit: It would probably be better if there was a way to find out whether == is defined for a type or if it just falls back to isequal (i.e. ===). Not sure how to do that though

mmesiti commented 10 months ago

I am not sure you would get discover any real bug with such a test (writing code that change the parameters requires a specific intent to break things anyway), but if it is quick enough to write, it might be worth it. The problem of equality of structs in Julia seems to be well known, do you think in this case using https://github.com/jolin-io/StructEquality.jl would be helpful?

NilsNiggemann commented 10 months ago

Oh, I did not know about this package, seems interesting. Although it requires to change the struct definition with a macro and is probably overkill here.

(writing code that change the parameters requires a specific intent to break things anyway),

Yeah, you're probably right. I think I am fine with leaving this out of the tests for now. Let's focus on testing the X only.

mmesiti commented 10 months ago

Here https://github.com/mmesiti/PMFRG.jl/tree/getXBubble-testability-refactor-2 I have changed getXBubble! to have the signature:

function getXBubble!(X,State,Deriv,Lam,Buffer,Par, isrange,  itrange,  iurange)
    [...]
end

Callees (and their tests) have been changed accordingly, removing the need to have a Workspace parameter.

Here instead https://github.com/mmesiti/PMFRG.jl/tree/getXBubble-testability-refactor-3 I have changed only getXBubblePartition! to have the signature suggested:

function getXBubblePartition!(X,State,Deriv,Lam,Buffer,Par, isrange,  itrange,  iurange)
   [...]
end
Also in this case, callees and their tests have been changed accordingly, removing the need to have a Workspace parameter (this required less code changes, although an additional step was needed to create the regression tests, and we still have the clarity problem - that's why I went for the other approach first).

In both cases, the regression tests use exclusively HDF5 to save the whole regression test data set (using H5Zblosc for compression, saving 30% of disk space - the data set is now only 1.5MB.

Regarding this:

() : The reason why it is () : The reason why it is not tested is that I would like, ideally, to make small changed and run the regression tests every time. Unfortunately this would be very slow. not tested is that I would like, ideally, to make small changed and run the regression tests every time. Unfortunately this would be very slow.

It turned out that in some cases, an error can make threading codes hang indefinitely. I have not investigated the issue more deeply. Once the code was corrected, I could use the regression tests for extremely quick turnaround, and it was pretty easy to do the changes without relying on version control tricks.

I can try to make these changes today.

In some place in the world, it's still today.

NilsNiggemann commented 10 months ago

Looks great to me thanks! I think we can merge this to MPI and then I can have a final look at the PR and hopefully merge this weekend. Perhaps the serialization in the unit tests are a bit on the verbose side, although I see the readability merit of doing it this way. That brings me to the final point:

and we still have the clarity problem - that's why I went for the other approach first.

I do not quite understand the clarity problem you are refering to here. Is it the fact that the functions take slightly more abstract "State" types and Parameters instead of unwrapping everything into types that are defined in Base Julia? I agree that the definition of a Workspace which wraps everything was probably too much, but I find the State type quite harmless and actually fitting for the theme of the ODE state. In any case though I think this solution looks more readable than what I had before, so If you are fine with it, I would be happy to merge this. Again thanks for the great work!

mmesiti commented 10 months ago

I do not quite understand the clarity problem you are refering to here. Is it the fact that the functions take slightly more abstract "State" types and Parameters instead of unwrapping everything into types that are defined in Base Julia?

I was referring to the fact that in version 3 getXBubble still depends on the whole Workspace. But after all, it most of its members anyway. I think it is reasonable to use the whole State there (even if not all its members are used).

mmesiti commented 10 months ago

I do not quite understand the clarity problem you are refering to here. Is it the fact that the functions take slightly more abstract "State" types and Parameters instead of unwrapping everything into types that are defined in Base Julia?

I was referring to the fact that in version 3 getXBubble still depends on the whole Workspace. But after all, it most of its members anyway. I think it is reasonable to use the whole State there (even if not all its members are used).

Edit: i agree that the State and Deriv abstractions are very fitting for the ODE solver, which is the central part of the code, and the framework you are using sort of dictates that you use them. They are not be the best abstraction for the calculation of X, just for the fact that getXBubble does not use all its members. It does not even compute $\tilde{X}_d$, right? So even X itself is not the best abstraction one can imagine for this function. But creating new structures for each computation might be overkill. After all, it is not hard to understand what is happening, which is the ultimate definition of clarity.

mmesiti commented 10 months ago

Anyway, in the next few hours I will merge https://github.com/mmesiti/PMFRG.jl/tree/getXBubble-testability-refactor-3 into the Mpi branch. If you prefer the other solution, we can just revert the last commit.

NilsNiggemann / PMFRG.jl

LFS/Serialization: necessary or not? #10