JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

nnz and nonzeros #6769

Closed eldadHaber closed 10 years ago

eldadHaber commented 10 years ago

Someone has decided to change nnz and to remove the nonzeros command.

I find this completely unacceptable! It is bad practice not to allow for backward compatibility. Many of our codes use these commands, and now we have to go and replace them in all of our previous versions.

We are investing a considerable amount of time developing codes in Julia for geophysical inversion and truly believe it is the way to go. If new Julia versions are not going to be backward compatible, we will need to rethink our development.

Please think about this whenever you want to retire a command.

JeffBezanson commented 10 years ago

Sorry about this. You can add the following to your code:

const nnz = countnz
Base.nonzeros(s::SparseMatrixCSC) = nonzeros(s.nzval)

and then you won't need to change every use. Wrapping this in if !method_exists(nonzeros,(SparseMatrixCSC,)) will let the code work with both 0.2 and 0.3.
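
For anyone copy-pasting, here is a minimal sketch of that wrapped shim (0.2/0.3-era API as described above; treat it as untested):

if !method_exists(nonzeros, (SparseMatrixCSC,))
    # 0.3-dev: nnz was renamed to countnz and the sparse nonzeros method was removed,
    # so restore both; on 0.2 this block is skipped and the original functions are used.
    const nnz = countnz
    Base.nonzeros(S::SparseMatrixCSC) = nonzeros(S.nzval)
end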

Personally I dislike the decision to allow explicit zeros in sparse matrices, but I think we should bring back the definition of nonzeros for sparse matrices.

ViralBShah commented 10 years ago

I find myself wanting to use nnz every once in a while. While we can retain stored zeros, it would be nice to have nnz and nonzeros.

eldadHaber commented 10 years ago

My problem is not that changes are made (I'm not a fan of this particular change, but that is not a big deal); my problem is backward compatibility. We now have a relatively large investment in a Julia library for geophysical inverse problems. If today a decision is made to retire one command, tomorrow it may be a different one. We cannot commit to building code when basic commands keep changing.

My requests for the future, please:

  1. Try not to retire commands.
  2. If you have a new command that does the same thing better, link the two (perhaps with a recommendation to use the new one).
  3. Documentation! There is no documentation saying that nonzeros is no longer available; the first I heard of it was when I tried to run code that had previously worked.

Thanks

tknopp commented 10 years ago

Until Julia reaches version 1.0, I don't see the issue with the API evolving. Yes, this can be painful, but trying to stay backward compatible across all the 0.x releases will not lead to a consistent 1.0 release.

We all know that the documentation can be improved, but this is a community project; there is nobody to blame for missing or wrong documentation. The project lives on participation.

Independently of this specific issue, I think there will be more breaking changes on the way to 1.0. If that is not OK for you (which is absolutely understandable), you might be better off with a programming environment that is more stable (e.g. Matlab).

JeffBezanson commented 10 years ago

Please continue to file issues like this for specific changes that cause problems. There is always a chance something got removed by mistake, or that we will change our minds.

nonzeros seems to have been botched a bit --- in all other cases (including nnz) we add deprecation warnings that let code continue to work. Also, adding your own aliases and workarounds like those I posted above has no overhead, so in some cases that will be the right approach.

simonster commented 10 years ago

I think we just changed the method signature for nonzeros so that it only applies to StridedArray in c330b1d64efa566911cdd20921ca9419e58bacf9, hence the lack of a deprecation warning. The existing method was inefficient for sparse matrices, but it could be implemented efficiently by adding the definition above.

JeffBezanson commented 10 years ago

Yes, but there still should have been a deprecation warning or error for the removed functionality. It was an oversight.

tkelman commented 10 years ago

And it highlights the not-very-good test coverage for sparse matrices. When this goes back in, it should have a test so that accidental removal without a deprecation gets caught in the future.

JeffBezanson commented 10 years ago

Well, the removal wasn't accidental. The tests were updated, but not the deprecations. Better test coverage would of course help in general with this kind of thing though.

ViralBShah commented 10 years ago

nnz does have a deprecation; that is how I keep getting reminded that I like the name. Perhaps nonzeros was missed. Anyway, I will revisit this whole matter soon and iron out the issues.

ViralBShah commented 10 years ago

@tkelman You are certainly right about the lack of tests and coverage for sparse matrices. Hoping to significantly improve sparse matrix support over the next few weeks.

ViralBShah commented 10 years ago

Cc: @mlubin @iainnz

mlubin commented 10 years ago

I don't think we should restore nnz, since it's so tempting to use yet would be the wrong choice for the majority of code that works with sparse matrices. As @tkelman mentioned in #6769, the representation details of sparse matrices are very important for writing efficient code. Users should have to explicitly choose whether they want the storage size of the matrix or the exact number of nonzero elements, which requires looping through the matrix.

Sorry @eldadHaber for the churn. The deprecation of nnz resulted from a change in the structure of SparseMatrixCSC to allow explicitly stored zero entries in the matrix, which is useful in a number of applications. This means that the exact number of nonzeros can no longer be computed in constant time. We believed that an explicit warning would be better than silently changing an O(1) operation into an O(number of stored entries) operation in users' code.
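
As a small illustration of the distinction (a sketch using 0.3-dev names and the raw SparseMatrixCSC constructor to force a stored zero):

julia> S = SparseMatrixCSC(2, 2, [1, 2, 3], [1, 2], [1.0, 0.0]);   # stores an explicit 0.0 at (2,2)

julia> nfilled(S)   # number of stored entries, available in O(1)
2

julia> countnz(S)   # number of entries that are actually != 0, needs a pass over the values
1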

mlubin commented 10 years ago

Rereading the comments in #6769, I'll defer to @JeffBezanson if the consensus is that it's better to silently change the performance of nnz.

JeffBezanson commented 10 years ago

My view is based on looking at the other nnz (now countnz) methods, which clearly count elements such that x!=0. Of course representation details are important for performance, but nfilled is not a drop-in replacement for nnz --- it doesn't mean the same thing.

mlubin commented 10 years ago

Right, it's not a drop-in replacement, which is why the deprecation calls countnz. My point is that the name nnz is misleading for sparse matrices, because almost anyone reading the code who hasn't read the Julia documentation would assume it is a constant-time operation on sparse matrices. The countnz name is much more explicit and properly describes what the function actually does.

mlubin commented 10 years ago

It's an application of the principle of least surprise.

JeffBezanson commented 10 years ago

I consider the meaning of the operation primary, not its performance. I guess I (admittedly not being a sparse expert) just don't see the value of trying to make people realize that nnz is O(n). If they need that function, there's nothing they can do about it, so what's the point?

Is it the case that knowing the actual number of !=0 elements is not useful, and 90% of the time people actually want the storage size? I suspect not, but if so then I might change my mind.

mlubin commented 10 years ago

In my experience writing linear algebra operations with sparse matrices, nfilled is the quantity of interest, not nnz. Compare the number of times nfilled is used in base/sparse/*, which is 28 by my count, with the number of times countnz is used, which is twice.

JeffBezanson commented 10 years ago

Maybe @eldadHaber can comment on how he uses nnz?

eldadHaber commented 10 years ago

I have a somewhat different perspective on nnz. First, for real and complex matrices, it does not make sense to me to ask the question nonzeros(A) != 0.

As we teach in any numerical analysis course, zero is not really 0. For real matrices we should be testing abs(nonzeros(A)) <= tol.

So I fail to see what the fuss about this command is. When I use nnz(A), I want to know roughly how much storage is needed for some variable. The number of "true" nonzeros should be asked for with a tolerance.

mlubin commented 10 years ago

It seems that this perspective falls under the nfilled use case where you were using nnz to access the number of elements stored in the matrix, nonzero or otherwise, as opposed to counting the occurrence of exact nonzeros.

eldadHaber commented 10 years ago

What I am trying to say is that counting exact nonzeros (for real and complex matrices) does not make much sense anyway. Sure, we can have this functionality, but it is misleading. If A has real values, mostly around 1, and 10% of the values are smaller than 1e-16, you still report those values as nonzeros.

tkelman commented 10 years ago

@eldadHaber Tolerancing and small numerical values are a separate question that applies to any operation involving floating-point numbers. Sparse algorithms are written in such a way that explicitly stored zero values don't change the output result (assuming you don't run into any overflow or NaN).

I want the storage size when I'm writing a sparse algorithm; when explicit zeros need to be there, it's almost always for matrix-structural reasons. For example, Pardiso requires storage to already exist for elements on the diagonal. Ideally these would be stored zeros where the language allows for them, but the Matlab interface to Pardiso requires hacky, inaccurate workarounds like adding eps * speye(n,n). Prior to 2009, PETSc did something similar (see https://bitbucket.org/petsc/petsc/src/cb6910d74d53/bin/matlab/PetscBinaryWrite.m#cl-31) when saving sparse matrices from Matlab into its binary format, though slightly more reliably: it used a small magic number and set any element matching that magic number back to an explicit 0 before saving the binary file.

The Matlab interface to Ipopt is another case of this. The linear solver used within Ipopt, like the majority of sparse direct algorithms, separates the structure-based symbolic factorization allocation step from the value-based numerical factorization step. You need to evaluate the sparse Jacobian and Hessian matrices at every iteration, with the same sparsity structure but varying nonzero values. What happens if one of the "structurally nonzero" elements happens to have a value of exactly zero at some intermediate iteration? If your language automatically removes zeros from sparse matrices, you have to account for this by making a copy of the entire sparse matrix data structure at every iteration to put the nonzeros in the proper place. Bad times.
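
To make that concrete, here is a hedged Julia sketch (0.3-era API, with a toy 2x2 Jacobian in place of anything Ipopt-specific):

# Structural nonzeros at (1,1) and (2,2); build the structure once, with stored zeros allowed.
J = SparseMatrixCSC(2, 2, [1, 2, 3], [1, 2], zeros(2))
for vals in ([1.0, 2.0], [3.0, 0.0])   # the second "iteration" produces an exact zero at (2,2)
    copy!(J.nzval, vals)               # overwrite values in place; the sparsity pattern never changes
end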

JeffBezanson commented 10 years ago

Thanks, that perspective is very helpful. In that case maybe we should use the general count function instead of either nnz or countnz. Or sum(abs(A) .>= tol) and the like.

mlubin commented 10 years ago

There was a lot of discussion on this in #5538, and I agree and have been arguing that countnz is a minority use case. That said, I don't think it's appropriate to have nnz return the value of nfilled, because, as @JeffBezanson mentioned, it's not the semantic meaning of the operation, which itself is general and is defined on other data structures. countnz could certainly be extended to use a numerical tolerance, but that's a separate issue from the function names.

eldadHaber commented 10 years ago

I think that the tolerance should be a user choice. I'm trying to construct an example to demonstrate some of these issues.

JeffBezanson commented 10 years ago

It would make sense to add a tol argument to countnz.

What about nonzeros? It seems one might want it to just return the nzval array, and not filter out stored zeros. In that case it could use a rename as well.

timholy commented 10 years ago

@eldadHaber, just so you're aware: the "fuss" over this command arose because we, unlike Matlab, now allow you to store zeros in a sparse matrix. Check out the implications (following code is Matlab):

>> A = sprand(10^4, 10^4, 0.1);
>> [i,j] = ind2sub(size(A), find(A, 1));
>> tic; A(i,j) = 0.1; toc
Elapsed time is 0.000443 seconds.
>> tic; A(i,j) = 0.0; toc
Elapsed time is 0.037815 seconds.

Notice that the second timing is nearly a hundredfold longer than the first: if you set a nonzero entry to zero, Matlab actually copies the entire data for the matrix, deleting that one entry from memory. That's really expensive. Matlab does this because it doesn't allow any stored value of a sparse matrix to be identically zero.

In contrast, we allow stored zeros. That will* make this kind of thing vastly more efficient. But it means there is a difference between the amount of storage required and the number of entries that are not identically zero. Given that Matlab only has the nnz command, we realized we had better differentiate these two meanings.

Breaking old code is a no-no, though, and usually we are reasonably good about avoiding that thanks to the deprecation framework, but it seems to have failed in this case.

*Currently it looks like the sparse setindex! needs updating so that it stops caring about whether the supplied value is 0.

tkelman commented 10 years ago

Looks like elementwise comparisons on sparse matrices currently output dense BitArrays, which might not always be the best choice. So at the moment you'll want to use A.nzval (or nonzeros, when that gets put back in, whatever its meaning) for any tolerancing operations where the implicit zero elements would return false. It is possibly worth specializing elementwise comparisons on sparse matrices to return sparse Boolean results, which happen to be mostly full of trues if the user does an operation that gives a mostly dense result.
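
A hedged illustration of that workaround (0.3-era functions; the tolerance is just an example value):

A = sprand(1000, 1000, 0.01)
tol = 1e-8
count(x -> abs(x) > tol, A.nzval)   # tolerance test over stored values only; implicit zeros never enter the comparison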

mlubin commented 10 years ago

findnz should be used for extracting the actual nonzeros, so I'd agree that nonzeros should be renamed and should return the nzval array.

eldadHaber commented 10 years ago

@timholy I really like your example, but I think it's really a question of the application.

For a sparse matrix that contains stored 0 values, one could argue for producing a new, trimmed sparse matrix, which can help in linear solvers and in faster mat-vec products.

I think your argument is valid as well, and I like the idea of letting the user decide whether they want to generate a new, "more economical" matrix (with these zeros removed) or keep the same sparsity structure with some of the entries being 0. I can certainly see applications for both in finite-volume/finite-element discretizations of PDEs.

As long as we document the two AND do not retire previously available functions without a really good reason (where is nonzeros ...), I'm happy with either option.

ViralBShah commented 10 years ago

There are two separate things going on simultaneously here, as already discussed. For general users, we try to maintain sparse matrices without stored zeros in them. But, for developers of libraries and sophisticated usage, we do allow stored zeros.

If we introduce a flag saying whether a sparse matrix contains stored zeros, we can effectively reintroduce nnz, which for all general usage will be fast enough. Only a few operations should need to deal with this flag. I feel that nonzeros should just give you a copy of the nzval vector without doing a zero check, since sparse matrices generated by general users should not contain stored zeros.
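
A one-line sketch of the proposed behaviour (not what Base defined at the time):

Base.nonzeros(S::SparseMatrixCSC) = copy(S.nzval)   # every stored value, no zero check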

tknopp commented 10 years ago

I think nnz is not a great name for a function. We also have count_zeros in Base, and having both countnz and count_zeros cries out for some name unification. I know count_zeros is currently only available for counting the bits of an integer, but being a generic function, it could be implemented for any collection.

mlubin commented 10 years ago

Introducing a flag increases the complexity of the data structure and increases the cognitive load on anyone using it, since they essentially now need to deal with two different data structures whose methods have different behavior and performance. This doesn't seem worth it just to be able to use the name nnz, which couldn't even replace nfilled in Base because functions in Base need to accept matrices with stored zeros. So we're just talking about user code, where it's pretty easy to define

nnz = nfilled

if they really like the name.

JeffBezanson commented 10 years ago

@ViralBShah I'm not sure I buy that --- stored zeros are either allowed or not. The idea of "structurally nonzero" entries is interesting though. Since assigning zero does remove the entry by default, how do we deal with that use case?

I've never liked the names count_zeros and count_ones; they are quite non-standard for those bit operations. The bit operations should be distinct from any collection operations. Better names would be bitcount or popcount. I'm not sure there is a standard name for counting zero bits though.

eldadHaber commented 10 years ago

I actually think that nnz is a good command. It is used in both Matlab and Python's SciPy, so it's only natural to use it here. We should be less concerned about being different; it makes the language harder for newcomers.

I think we should have a command nnz(A, tol) that counts the stored entries whose magnitude exceeds the tolerance, roughly nnz(A, tol) = length(find(abs(nonzeros(A)) .> tol)).

So nnz(A) = nnz(A, 0) gives the number of nonzeros stored in the array, and if we are interested in the "real" number of nonzeros, we use a positive tol.

This makes usage simpler and similar to other languages, and it can accommodate the idea of storing 0's in the matrix.
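
A hedged sketch of what such a tolerance-aware count could look like (countnz_tol is a hypothetical name for illustration, not an existing Base function):

# With tol = 0.0 this counts every stored entry that is not exactly zero.
countnz_tol(A::SparseMatrixCSC, tol=0.0) = count(x -> abs(x) > tol, A.nzval)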

eldadHaber commented 10 years ago

By the way, similar logic works for incomplete Cholesky.

mlubin commented 10 years ago

Sparse matrices can store any type of object, including types which don't necessarily have a reasonable definition of abs, so baking the notion of a tolerance into the definition of nnz loses a lot of generality. This was also discussed in #5538 where nfilled was chosen partially because it is sufficiently generic and allows for future extensions to real-valued sparse matrices with nonzero "default" values.

kmsquire commented 10 years ago

Couldn't nnz then be restricted to AbstractArray{Number}?

ViralBShah commented 10 years ago

@JeffBezanson The situation is not as black and white with stored zeros. Throughout the sparse matrix library in Base, we filter out zeros both on input and when they are generated by operations. This is the behaviour most users expect.

julia> s = sparse([1, 2], [1, 2], [1.0, 0.0])
2x2 sparse matrix with 1 Float64 entries:
    [1, 1]  =  1.0

The only way to get stored zeros is to use the SparseMatrixCSC constructors. Basically, this is only for folks who know the underlying details of the data structure and know what they are doing. This is typically for cases where the nonzero structure is known upfront and fill-in may happen in a later operation. Given that we do support stored zeros and want to make them first class, we should probably also have a better interface for creating matrices with stored zeros; for example, sparse could take a stored_zeros=true option.

@mlubin I disagree that we should be asking users to do nnz=nfilled, if they like the name. This is not a case of one user preferring one name, but a large number of users who are used to a certain set of names and behaviour, and this would ease the transition.

I do agree that the cognitive load increases with a flag, and that should be avoided. It was just a thought.

Rethinking all of this, what if nfilled were renamed to nnz and we make it clear that it is the number of structural nonzeros, as suggested above? nonzeros would then return all the structural nonzero values. In my opinion, one name is not necessarily better or clearer than the other. Users who do not care about stored zeros will never run into the issue. For those used to Matlab and Octave, the transition will be easier with familiar names. Those who do need stored zeros will know that these are structural nonzeros.

ViralBShah commented 10 years ago

@eldadHaber As discussed here, nnz is typically about structural nonzeros.

@mlubin @kmsquire We have a bunch of sparse capabilities that assume a sparse matrix of numbers. It would be good to restrict things like linear algebra, where the operations do not generalize further, to numbers only. What we really need is an N-d COO format, which would be far more general for storing objects in a sparse array container.

eldadHaber commented 10 years ago

I really like your proposal to bring back nnz and nonzeros.

I am one of the users who likes the idea of a fixed nonzero pattern (I use it in FEM discretizations of PDEs), but I think you are right that it is for more sophisticated users. There is also much to be said for compatibility with Matlab, SciPy, and Octave.

tknopp commented 10 years ago

Compatibility with Matlab (and SciPy) has not really been a strong driver for naming decisions, if you look at other issues. The deprecation of max in favor of maximum is a recent example that in my opinion has a much larger impact than the deprecation of nnz (and I am not happy about maximum, but I understand the reasoning behind it).

One further thing is that multiple dispatch quite naturally leads to different naming in various situations, because the type an operation applies to does not have to be encoded into the function name. I contributed some code to the Gtk.jl project, and 90% of the time I overloaded existing functions (e.g. push! for adding something to a GtkListStore). I could have tried to mirror the naming of the Python Gtk wrapper (e.g. add instead of push!), but that would not have been the Julian way (and Jameson would of course not have accepted the patch).

I actually don't have a strong feeling in the case of nnz. I just want to make clear that, from what I have seen, Matlab compatibility has not been a very important consideration in the past. @StefanKarpinski might correct me if I am wrong here. I just think it is fair to warn Eldad not to have too high expectations of Matlab compatibility.

P.S. Searching this issue tracker for meshgrid also turns up a discussion of Matlab compatibility. Reference semantics for arrays, versus copy-on-write in Matlab, is another major difference.

ViralBShah commented 10 years ago

You are absolutely right: compatibility with Matlab or any other system is not an important driver for choosing names in Julia. However, we retain compatible names so long as they are meaningful and not kludgy, and we often change names when something is confusing or poorly named.

On balance, IMO, going with the structural-nonzero rationale, nnz and nonzeros are good names and behave as expected for everyone.

tknopp commented 10 years ago

Yes, but you can't look at nnz in isolation. It is not optimal to have nnz, count_zeros, and count_ones all as generic functions in Base.

ViralBShah commented 10 years ago

nnz and count_ones/count_zeros are similar sounding operations, but I do not believe that these names need unification. The count_ones/count_zeros names are not great, and bitcount was what struck me as a better name.

mlubin commented 10 years ago

I'm okay with renaming nfilled to nnz now that the countnz functionality has been separated out.

JeffBezanson commented 10 years ago

Here's me trying to ruin the party as usual: if nonzeros means structural non-zeros, then do we have

nonzeros(A::Array) = A

?

mlubin commented 10 years ago

I'd argue that nnz shouldn't be defined for generic arrays; countnz should be used instead. nonzeros is less clear...