Closed mmesiti closed 10 months ago
Problems with CI related to LFS (https://github.com/mmesiti/PMFRG.jl/tree/CI-LFS, WIP)
A possibly hard problem with the current serialization/deserialization approach for the current regression tests is that since the tested function takes a Workspace as an argument, any change in one of the data structures in a Workspace (in particular, the ones related to #11) causes the regression tests to fail (as mentioned here)
Yes I think these kinds of problems should be avoided as much as possible. The tests should better not break upon modifying internals only. I think instead of serialising the workspace type, we should check against the actual array types that it contains.
Or: What do you think of performing some simple checks involving only some elements of the returned values, i.e.
getXBubble!(workspace,...)
@test workspace.X.c[1,2,3,5] == ...
Another interesting idea: Perhaps we could implement some kind of "hashing" of the underlying arrays. Using an actual hash is probably an issue due to the same reasons. But if we simply check the sum(abs,X.a) etc, we can probably also catch essentially all problems and we will also have a bit of tolerance in case there are some changes within floating point accuracy.
I have been imprecise: the reason why the regression test fails now is deserialization (here https://github.com/mmesiti/PMFRG.jl/blob/MPI-ext/test/RegressionTests/PMFRG.getXBubble.jl#L5, or https://github.com/mmesiti/PMFRG.jl/blob/CI-LFS/test/RegressionTests/PMFRG.getXBubble.jl#L5).
Of course, if we write to disk only the bits that we care about instead of the whole worspace it could be much better. The attractiveness of just serializing the tuple of arguments to getXBubble! resides in the fact that it's super simple and can be done automatically. Anyway, these are the possible solutions for this:
I think that 2 (or perhaps 3) are the most reasonable things to do. Regarding the "hashing" idea, it's interesting to test the output, but what about the input? Generating a valid, realistic input instead of recording it could also be a solution.
(TO BE CONTINUED)
After reverting the change to OneLoopParams
in 58c8514, the de-serialization does not fail any more. Another good side effect of decoupling. But still the problem remains, although less pressing.
just for the record, I tried this
- Change Recorder so that it serializes in a human-readable way (e.g., by writing code instead of a binary file).
which seems a good idea in general for Recorder, to record data in code that can be read and modified by a human.
I have converted the data to a string using repr
, and then formatted it. The process was extremely slow, requiring 2200 seconds for Nlen=4 and N=8 (15MB of text - autoformatters are not made for that), and eventually the obtained string could not be parsed because of the following error:
ERROR: MethodError: Cannot `convert` an object of type Int64 to an object of type sumElements
because the representation of the data contains things like
sumElements[1 2 3]
which apparently cannot be parsed correctly.
What I did, more or less:
data = deserialize("PMFRG.getXBubble.data")
datastr = repr(data) # this returns all data in a single line, which is not really wieldly
open("datarepr.jl",write=true) do f
write(f,datastr)
end
using JuliaFormatter
@time format("datarepr.jl")
# 2155.313091 seconds (30.04 M allocations: 1.889 GiB, 0.03% gc time, 0.00% compilation time)
datastr_formatted = read("datarepr.jl",String)
data_again = eval(Meta.parse(datastr_formatted))
# ERROR: MethodError: Cannot `convert` an object of type Int64 to an object of type sumElements
This seems to me a bit like reinventing the wheel. I don't think we can ever get more sophisticated with our solution that JLD2. Imo, Recorder.jl could be used more as a quick on-the-fly development tool. Maybe to generate unit tests, one can reduce the functionaliuty to using only arrays and use HDF5/CSVs?
I don't think we can ever get more sophisticated with our solution that JLD2.
Prompted by this comment (and other praises I've heard of JLD2 previously), I checked JLD2 and found that in addition to the benefit of not breaking between Julia versions, it can also deal with issues when datatypes change (https://juliaio.github.io/JLD2.jl/dev/advanced/#Explicit-Type-Remapping), which is exactly what I was concerned about when I wrote
- Write (de)serialization library that can allow users to specify data migrations (totally impractical)
to be fair, that's exactly what you said here
Imo, Recorder.jl could be used more as a quick on-the-fly development tool.
That's exactly the idea. I can switch there from Serialization to JLD2, and most of our problems should be fixed.
Maybe to generate unit tests, one can reduce the functionaliuty to using only arrays and use HDF5/CSVs?
Saving to disk and reading only the arrays would make sense as well, but to me it looks like this could take a lot of time to study the code. getXBubble!
takes as an argument the whole Workspace, so I would have to analyse what is in a workspace, what is relevant and I should track, and what is not relevant and I can fake (to create a "fake" workspace, just for testing). I thought that serializing the whole workspace was conceptually much simpler, general and much less error prone.
I also assumed that passing the whole Workspace to getXBubble! can be desirable since it does not constrain further development (you just pass everything the function might need), although this tends to go against the philosophy of interface segregation which typically makes testability much simpler.
Also, given that JLD2 allows to specify data migrations, we might not need to do that.
Using N=5 instead of N=7 to keep the size even smaller (2.2MB), since we know N=5 now works
Another take at this: In my (currently private) development branch of SpinFRGLattices, I have changed the geometry struct to be slightly more general by allowing for NLen to be a non-integer.
(This is useful in some more complicated lattices, in which it is better to define the system size via a sphere of continuous radius NLen.)
Of course, this breaks JLD2s serialization format and hence throws an error in the unit tests.
This is a bit of a problem, since I would consider this redefinition a non-breaking change. It seems to me that these regression tests will always remain very sensitive to the changes in the internals of dependencies.
Do you think it might be better to define a more explicit version of getXBubble that does not take the full workspace and perform the checks only on those values that are actually changed?
We can still leave a convenience method getXBubble!(Workspace,...) = getXBubble!(Workspace.X, Workspace.State, ...)
that is not part of the regression tests.
Do you think it might be better to define a more explicit version of getXBubble that does not take the full workspace and perform the checks only on those values that are actually changed?
There is a subtle balance between two aspects: readability and testability on one side, and (some) flexibility on the other. For readability and testability, it is good to have a function depend only on the data that it really needs. In that way it is clearer what the function does exactly, what it does, and it is easier to build test cases (not only for regression tests, but also unit tests). But if at some point you realize that your function needs another argument, or could do something additionally more efficiently than having a separate function, then just passing the whole state of the program to it gives you the maximum flexibility.
I tend to prefer readability&testability to (some) flexibility, because I think that most of the time one can achieve flexibility in other ways. But I imagine that readability for you likely does not matter much, because you have the domain knowledge to know exactly what that function does (and you wrote the code), and the testability aspect looks important now because we have these pesky tests for that function, because I needed them to make sure the parallelization I made does not break the code.
Anyway, from my point of view this would be a good refactoring.
Do you think it might be better to define a more explicit version of getXBubble that does not take the full workspace and perform the checks only on those values that are actually changed
I can try to make this change myself if you like
Readability does matter to me and I agree with you that changing this likely improves upon it. I think this change also does not significantly hinder development speed, so it is probably a good idea to implement this change.
I think the testability is perhaps even more important, in particular I am facing more issues with tests failing locally as there seem to be some issues reading the Workspace. I'd really like resolving this issue before merging this PR, so if you could take a look at it that would be great!
I have a preliminary draft here: https://github.com/mmesiti/PMFRG.jl/tree/getXBubble-testability-refactor-1
It is not yet tested (*), before I spend more time on this I wanted to ask if this is what you had in mind.
In this draft, I have taken all the callees of getXBubble!
(not getXBubble!
itself yet), and made so that the parameter list contains only quantities that are actually needed by every function. I moved everything to kwargs (or at least that was the plan). The result is that now all the functions have a lot of arguments. Some considerations on this:
Only in the case where all the members of a structure are used, I pass it as a whole. This happens rarely, but it happens. We could relax this rule, and pass the whole structure to the function even if not all the members are used. My conjecture is that this could be done safely for those types that PMFRG owns, because for those types coupling should not be a problem. For the types that PMFRG does not own (e.g., the ones that come from SpinFRGLattices, i.e. the System, right?) I think it would be best to split them, to reduce the coupling instead.
What do you think?
(*) : The reason why it is not tested is that I would like, ideally, to make small changed and run the regression tests every time. Unfortunately this would be very slow. So what I have done is to make a lot of small commits that should in principle all work, plan to test only the last one (maybe we're lucky and it passes) and then plan to use git bisect
in the very likely case that it does not work, to find the commit that broke the code. When I find it, fix the commit and rebase all the later commits on the fix. In this way, instead of $N_{commits}$ runs of the test suite, I need only $log2(N{commits})$, so I can keep the commits as small as it is comfortable for me.
Thanks for the effort! However I think the lack of inferability will likely be a problem, likely with big performance regressions: The reason is that Julia does not infer keyword arguments (to improve compiler performance). As a result the compiler will not know the types inside of addX! and cause a lot of allocations each time the function is called (actually this should probably break a unit test)
I was thinking more along the lines of extracting from the workspace what is actually mutated and leaving the params as an argument - Personally, I think this is possibly the most readable variant as one does not get overwhelmed with all the arguments. Perhaps more like this:
function getXBubblePartition!(X,State,Deriv,Lam,Buffer,Par, isrange, itrange, iurange)
[...]
end
This would make it more transparent, which object will actually be mutated, namely only X. (Not completely sure which order of the other arguments is more sensible though) We could then change the unit tests by converting X to an bigger array (I want to store X like this anyway in the future) something like this would avoid serialization that breaks too easily (not tested whether it works):
convertXToArray(X::BubbleType) = stack((X.a,X.b,X.c,X.Ta,X.Tb,X.Tc),dims = 5)
saveX(X) = h5write("X.h5","X",convertXToArray(X))
readX(file) = BubbleType(eachslice(h5read(file,"X"),dims = 5)...)
If we are very paranoid , we could also simply check whether any of the other states are mutated I suppose. What do you think?
Ok, this makes sense. It will require some important changes to the regression test logic but it will make them more robust.
In particular, I will need to save and read at least X, State and Deriv. The parameters can be re built from code instead, which might still break if the constructors change, but can be fixed much more easily.
I can try to make these changes today.
Thanks! I think if the parameters get too complex, we can perhaps ignore them for this PR. In general there should never be anything in the code that modifies the params, so I think it is anyway better to implement an independent test that verifies this:
using PMFRG, SpinFRGLattice,Test
for type in [:Tuple,:UnitRange,:AbstractArray,:Number,:Bool,:String,:Symbol,:Nothing,:Missing]
@eval IsEqualRecursive(x1::$type,x2::$type) = x1 == x2
end
function IsEqualRecursive(x1::T,x2::T) where T
result = true
for k in fieldnames(typeof(x1))
result = result && IsEqualRecursive(getfield(x1,k),getfield(x2,k))
end
return result
end
IsEqualRecursive(x1::T1,x2::T2) where {T1,T2} = false
function TestParams(;kwargs...)
Par = Params(getPolymer(2);N=5,kwargs...)
Par_comparison = Params(getPolymer(2);N=5,kwargs...) #the same parameters
@testset "Params" begin
@testset "before run" begin
@test IsEqualRecursive(Par,Par_comparison)
end
SolveFRG(Par)
@testset "after run" begin
@test IsEqualRecursive(Par,Par_comparison)
end
end
end
TestParams()
Perhaps this should be included in the tests that I am doing anyway. Do you think this would be good? Edit: It would probably be better if there was a way to find out whether == is defined for a type or if it just falls back to isequal (i.e. ===). Not sure how to do that though
I am not sure you would get discover any real bug with such a test (writing code that change the parameters requires a specific intent to break things anyway), but if it is quick enough to write, it might be worth it. The problem of equality of structs in Julia seems to be well known, do you think in this case using https://github.com/jolin-io/StructEquality.jl would be helpful?
Oh, I did not know about this package, seems interesting. Although it requires to change the struct definition with a macro and is probably overkill here.
(writing code that change the parameters requires a specific intent to break things anyway),
Yeah, you're probably right. I think I am fine with leaving this out of the tests for now. Let's focus on testing the X only.
Here https://github.com/mmesiti/PMFRG.jl/tree/getXBubble-testability-refactor-2 I have changed getXBubble!
to have the signature:
function getXBubble!(X,State,Deriv,Lam,Buffer,Par, isrange, itrange, iurange)
[...]
end
Callees (and their tests) have been changed accordingly, removing the need to have a Workspace parameter.
Here instead https://github.com/mmesiti/PMFRG.jl/tree/getXBubble-testability-refactor-3 I have changed only getXBubblePartition!
to have the signature suggested:
function getXBubblePartition!(X,State,Deriv,Lam,Buffer,Par, isrange, itrange, iurange) [...] end
Also in this case, callees and their tests have been changed accordingly, removing the need to have a Workspace parameter (this required less code changes, although an additional step was needed to create the regression tests, and we still have the clarity problem - that's why I went for the other approach first).
In both cases, the regression tests use exclusively HDF5 to save the whole regression test data set (using H5Zblosc for compression, saving 30% of disk space - the data set is now only 1.5MB.
Regarding this:
() : The reason why it is () : The reason why it is not tested is that I would like, ideally, to make small changed and run the regression tests every time. Unfortunately this would be very slow. not tested is that I would like, ideally, to make small changed and run the regression tests every time. Unfortunately this would be very slow.
It turned out that in some cases, an error can make threading codes hang indefinitely. I have not investigated the issue more deeply. Once the code was corrected, I could use the regression tests for extremely quick turnaround, and it was pretty easy to do the changes without relying on version control tricks.
I can try to make these changes today.
In some place in the world, it's still today.
Looks great to me thanks! I think we can merge this to MPI and then I can have a final look at the PR and hopefully merge this weekend. Perhaps the serialization in the unit tests are a bit on the verbose side, although I see the readability merit of doing it this way. That brings me to the final point:
and we still have the clarity problem - that's why I went for the other approach first.
I do not quite understand the clarity problem you are refering to here. Is it the fact that the functions take slightly more abstract "State" types and Parameters instead of unwrapping everything into types that are defined in Base Julia? I agree that the definition of a Workspace which wraps everything was probably too much, but I find the State type quite harmless and actually fitting for the theme of the ODE state. In any case though I think this solution looks more readable than what I had before, so If you are fine with it, I would be happy to merge this. Again thanks for the great work!
I do not quite understand the clarity problem you are refering to here. Is it the fact that the functions take slightly more abstract "State" types and Parameters instead of unwrapping everything into types that are defined in Base Julia?
I was referring to the fact that in version 3 getXBubble
still depends on the whole Workspace
. But after all, it most of its members anyway.
I think it is reasonable to use the whole State there (even if not all its members are used).
I do not quite understand the clarity problem you are refering to here. Is it the fact that the functions take slightly more abstract "State" types and Parameters instead of unwrapping everything into types that are defined in Base Julia?
I was referring to the fact that in version 3 getXBubble
still depends on the whole Workspace
. But after all, it most of its members anyway.
I think it is reasonable to use the whole State there (even if not all its members are used).
Edit: i agree that the State and Deriv abstractions are very fitting for the ODE solver, which is the central part of the code, and the framework you are using sort of dictates that you use them. They are not be the best abstraction for the calculation of X, just for the fact that getXBubble does not use all its members. It does not even compute $\tilde{X}_d$, right? So even X itself is not the best abstraction one can imagine for this function. But creating new structures for each computation might be overkill. After all, it is not hard to understand what is happening, which is the ultimate definition of clarity.
Anyway, in the next few hours I will merge https://github.com/mmesiti/PMFRG.jl/tree/getXBubble-testability-refactor-3 into the Mpi branch. If you prefer the other solution, we can just revert the last commit.
(continuation of https://github.com/mmesiti/PMFRG-copy.jl/issues/3)
Will the package manager also download the LFS objects?
Some considerations: