mulias / roc-array2d

Fixed size 2D arrays for Roc
ISC License
5 stars 0 forks source link

X and Y is inverted #5

Open lindskogen opened 10 months ago

lindskogen commented 10 months ago

Not sure if this is just personal preference, and I know people will disagree between different applications, such as math (positive Y is up) vs computer graphics (positive Y is down).

But I think we agree on that X should be cols and Y should be rows? Is there a reason for them being inverted?

mulias commented 10 months ago

This is a good question, thanks for asking. Basically I agree that the current API is not right and I'd like to change it, but I don't think it's as simple as swapping X and Y. I would like this library to be useful in the 2d graphics case you're talking about, but there are other contexts where I think using X for cols and Y for rows is more confusing than helpful. In particular for matrix math you usually talk about an "M by N" matrix, where M is rows and N is cols. In that context "Y by X" doesn't sound right.

I'm not really an expert on this kind of thing, but one thing I've noticed looking at similar libraries in other languages is that most libraries don't use X and Y at all. Here are three examples I found helpful, if you've got others you know of please share:

In all three of these cases dimensions are referred to in terms of rows and cols, and indices use positional tuples.

Keeping all that in mind, here are the options I've come up with. I'd be interested to hear about which ones make sense to you, or if there's another option that I'm overlooking:

  1. Keep X as rows and Y as cols, make a note that if you're used to 2d graphics this won't work how you expect. I don't think this is a good idea.
  2. Swap so X is cols and Y is rows, make a note that if you're used to mathematical matrices this won't work how you expect. Potentially change the shape type to Shape : { rows: Nat, cols: nat }. This is probably an ok way to go, since matrix math is generally done in terms of the entire matrix at once, so indexing specific positions isn't as important.
  3. Change index type to Index : (Nat, Nat) for rows and columns. This sidesteps the issue, since you can chose to interpret the values differently depending on context. Tuples are how all three of the libraries above deal with indices. I don't love having to use index.0 and index.1 to get the rows and cols.
  4. Change index type to Index : { row: Nat, col: Nat }. It's a little more verbose, but is nicer to read than the tuple option. One reason why I didn't go with this initially is that I was thinking about expanding the library for 3d arrays, and I wasn't sure what the 3rd dimension would be called. I just dug into this a bit more and found some resources that suggest depth, which could be worse!

What do you think? I'm open to option 1 but I think I'm more interested in trying out 4.

edit: I meant to say I'm open to option 2, not 1!

lindskogen commented 10 months ago

Thank you for the detailed reply. It really shows that you’ve put a lot of thought into making the library. I think my favorite option is 4, we side-step the issue by not using x/y at all. And more in the spirit of roc to use verbose notation. (See List.range).

But maybe there is another alternative way to use tags (like List.range) to give the coordinates in different ways.

Array2D.get (X 20, Y 0) Or Array2D.get (Row 20, Col 0)

ryanb commented 10 months ago

@mulias thanks for the detailed writeup.

I'm not a fan of option 4 because I think of row as representing a list of elements, not a number. I realize it's an attribute on index, but index.row is also confusing to me.

I prefer option 3: tuples. It supports 3 dimensions and you can use pattern matching to name the variables: Array2D.mapWithIndex array \element, (x, y) -> ...

Update: You'll still need to pick a name for functions, such as flipX and flipY, for those I prefer flipRows and flipCols or flipColumns.

mulias commented 10 months ago

That's a good point, I hadn't considered that pattern matching on a tuple would let you name the component parts whatever you'd like.

Worth noting In the example Array2D.mapWithIndex array \element, (x, y) -> ... you're assigning x to rows and y to columns. To do it in the graphics way the pattern would have to be (y, x). I think though that if none of the library functions reference x/y then it would be down to user preference, (x,y) is fine as long as your code is internally consistent.

ryanb commented 10 months ago

Good point about (y, x), looks like the other libs linked use row then column as the order for their indexes too. The order is still tricky but at least it’s possible to name them.

mulias commented 10 months ago

I think it comes down to the user's expectations. To me (y, x) feels awkward, but maybe for someone who's used to thinking in that way it seems perfectly reasonable. For the alternative where you choose to do (x, y) and be consistent about x being rows and y being columns, I think it would work fine until you try to print using Array2D.joinWith and get a result that's "sideways". One solution to that would be to replace Array2D.joinWith with Array2D.joinRows/Array2D.joinCols.

ryanb commented 10 months ago

@mulias I agree, (y, x) feels awkward, but less awkward to me than using x for rows and y for cols. Perhaps I'm too used to working with screen coords and not the mathematical side of a matrix.

I suppose the issue with (y, x) is it has the potential to introduce difficult bugs if you ever reverse it. It would be an easy mistake to make.

I'm starting to come around to { row : Nat, col : Nat } because you can still patten match to x/y if you want (I think?): \{ row: y, col: x } then it's more explicit and less chance to introduce bugs.

Update: After briefly looking into how a matrix is used in mathematics, I don't see a consistent pattern for what x and y represent. It seems fairly common to have a matrix with two rows for mapping coordinates on a grid, the first row is the x coordinate, the second row is the y coordinate. If you're doing this it would be confusing for x and y to have different meanings for representing a position in the matrix. For this reason I think x and y should be avoided in the library and use row and col because it's explicit and it is a common naming pattern in both fields.