numberscope / frontscope

Numberscope's front end and user interface: responsible for specifying sequences and defining and displaying visualizers
MIT License

FactorFence Visualizer #406

Closed katestange closed 3 months ago

katestange commented 4 months ago

By submitting this PR, I am indicating to the Numberscope maintainers that I have read and understood the contributing guidelines and that this PR follows those guidelines to the best of my knowledge. I have also read the pull request checklist and followed the instructions therein.


This is a factor visualizer. It takes the terms of the sequence s(n) and creates a bar of height log(s(n)) at position n. The bar is divided into pieces of height log(p) for each prime p in the prime factorization (with multiplicity).
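A minimal sketch of the bar construction described above, assuming small positive integer terms and plain trial division. The names `Segment`, `primeFactors`, and `barSegments` are hypothetical and are not taken from the FactorFence source.

```typescript
// Hypothetical types and names, for illustration only.
interface Segment {
    prime: number
    bottom: number // height at which this piece starts
    height: number // log(prime)
}

// Trial-division factorization; adequate for small positive integers.
function primeFactors(n: number): number[] {
    const factors: number[] = []
    let rest = n
    for (let p = 2; p * p <= rest; p++) {
        while (rest % p === 0) {
            factors.push(p)
            rest /= p
        }
    }
    if (rest > 1) factors.push(rest)
    return factors
}

// Stack one piece of height log(p) per prime factor, with multiplicity.
function barSegments(term: number): Segment[] {
    const segments: Segment[] = []
    let bottom = 0
    for (const prime of primeFactors(term)) {
        const height = Math.log(prime)
        segments.push({prime, bottom, height})
        bottom += height
    }
    return segments
}

const bar = barSegments(360) // 360 = 2^3 * 3^2 * 5
const total = bar.reduce((sum, seg) => sum + seg.height, 0)
console.log(bar.map(seg => seg.prime), total, Math.log(360))
```

Because log(ab) = log(a) + log(b), the pieces add up to the full bar height log(s(n)), up to floating-point rounding.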

Some features:

  1. There's one "highlighted" prime, which is always shown at the bottom of the bars. If you click off the graph, highlight is removed. If you click on a prime, it switches that prime to the highlight role.
  2. If you mouse-over the graph, information on the bar and prime you are hovering above is shown below the graph
  3. You can use the mouse scrollwheel to zoom, and the arrow keys to move around; there are a few other special keys
  4. It treats signs separately (checkbox turns this on/off): the sign is removed before taking log and slapped back on afterward.
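The sign handling in item 4 could look roughly like the following. This is a sketch under the assumption that a zero term simply gets no height; `signedLogHeight` is a hypothetical name, not the visualizer's own.

```typescript
// Hypothetical helper: strip the sign, take the log, put the sign back.
function signedLogHeight(term: number): number {
    if (term === 0) return 0 // log(0) is undefined, so zero gets no height
    const magnitude = Math.log(Math.abs(term))
    // Negative terms come out negative, i.e. below the baseline.
    return Math.sign(term) * magnitude
}

console.log(signedLogHeight(8), signedLogHeight(-8))
```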

A few interesting ones:

http://localhost:5173/?name=5350&viz=FactorFence&terms=1000&highlight=0&seq=OEIS+A005350

http://localhost:5173/?name=abc+hits&viz=FactorFence&terms=1000&highlight=0&seq=OEIS+A130510

katestange commented 3 months ago

I am fine either way with your making the changes about standardizing the view, or doing it myself if you prefer. Just let me know.

If you go ahead that's great! Thank you!

gwhitney commented 3 months ago

> want to rerun preSketch() on change of parameters ... Is this a situation we want to worry about at all?

I think since this seems to be an unusual situation, it's not too onerous in such a case just to expect that the visualizer writer will call preSketch() "manually" in their custom parameterChanged() function.

Getting going on the standardize view changes to FactorFence. Will let you know when I have pushed it and it is ready for you to try out.

katestange commented 3 months ago

So.... my son and I were playing with factor fence and it occurred to me that.... the only actual purpose of the Number of Terms parameter is to decide how far to pre-grab factorizations. The visualizer doesn't actually draw all those bars until they are needed when they appear onscreen when zooming out. And in fact, it's just annoying from the user's point of view to decide ahead of time how far they might hope to zoom out, and have to up this parameter as needed. It's just distracting that the parameter choice exists at all. And something like the natural numbers as a formula or random number or whatever ought to let you zoom out as far as you want.

What it really ought to do is ask for more factorizations as you go (maybe staying ahead of you so you don't notice). Why should it make the user wait a while at the beginning in order to factor a bunch of stuff that isn't even on screen to begin with?

So maybe we should remove the Number of Terms parameter entirely, and implement some well-behaved background caching. Should that be an issue to file for later after this PR?

(Side note: This is a kind of funny thing to realize, because we did a lot of work to make that particular parameter behave well (e.g. the warnings capability), but of course that will be work well spent on other visualizers like turtle.)

Thoughts?

gwhitney commented 3 months ago

OK, I believe I have the view settings reverting to the "standard initial view" when we want them to (new sequence, new URL, or canvas resize) and not when we don't want them to (parameter/checkbox change). Please give it a try and let me know what you think.

As far as the terms param goes, I agree that it serves no purpose in the context of FactorFence. It can't hurt to be able to scroll through all of the available terms. If you never get way out to the high-index terms, it won't hurt you that you could. And in fact, FactorFence is already being a very good citizen with respect to waiting for terms/factorizations to show up. So the only code change needed is to rip out the terms param, and all should be well. As usual, let me know if you are going to make the modification or you'd like me to do it.

katestange commented 3 months ago

I gave it to my six year old and he didn't succeed in crashing it! Your commit seems to work well. I noticed a few things that needed doing:

  1. Check box parameters should be consistent in whether they use question marks (I took them out)
  2. I removed the number of terms parameter and default to min(10000,available). It's pretty hard to zoom or scroll enough to see 10000 terms, but just in case anyone does, it will load more terms as long as possible, which may be forever, e.g. when using random or formula. By setting initialLimitTerms = 10 or something small, you can test this feature and experience the slight lag caused by extending the terms.
  3. Images needed to be updated again.

katestange commented 3 months ago

One final thing. The Thue-Morse sequence (A010060), or any other sequence made of 0,1,-1, is a special case. I noticed the text floats to the top in this case. I added a line of code to fix this.

katestange commented 3 months ago

Ok, as far as I know we are done? (Assuming I didn't break anything new.)

katestange commented 3 months ago

Darn, my fix for Thue-Morse broke the Bernoulli numerators (A027641). Fixed the fix.

gwhitney commented 3 months ago

> Ok, as far as I know we are done? (Assuming I didn't break anything new.)

When you bump up this.last, you should also call this.seq.fill(this.last) (but don't await it) so that in the background it starts factoring the new terms.
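A rough illustration of that suggestion, assuming `fill` returns a promise that resolves once the terms are ready. `FakeSequence`, `FenceSketch`, and `extendTerms` are hypothetical stand-ins; only `this.last` and `this.seq.fill(this.last)` come from the comment above.

```typescript
// Hypothetical stand-in for a sequence object that caches factorizations.
class FakeSequence {
    requestedThrough = 0

    // Pretends to fetch and factor every term up to index `last`.
    fill(last: number): Promise<void> {
        this.requestedThrough = Math.max(this.requestedThrough, last)
        return Promise.resolve()
    }
}

class FenceSketch {
    seq: FakeSequence
    last: number

    constructor(seq: FakeSequence, initialLast: number) {
        this.seq = seq
        this.last = initialLast
    }

    // Raise the last index the view may show, e.g. after zooming out.
    extendTerms(newLast: number): void {
        if (newLast <= this.last) return
        this.last = newLast
        // Deliberately not awaited: drawing continues while the new terms
        // are prepared. Errors are still reported rather than dropped.
        this.seq.fill(this.last).catch(err => console.error(err))
    }
}
```

Attaching a `catch` handler keeps the call non-blocking while avoiding an unhandled promise rejection if the fill fails.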

katestange commented 3 months ago

Update: I couldn't help myself, I added some info on the non-factored terms. Now we should have the following:

  1. if a term is -1,0,1, we see a dash placeholder
  2. if a term wasn't factorable (too big), we see a question mark
  3. now the factorization info will display as long as we are over the graph in a left-right way, and for dashes or question marks it will give the value of the term and say "(no factorization)" (it used to be the mouse had to be on the graph itself, now it just has to be above the graph in horizontal regard, regardless of vertical positioning)

Some good sequences for testing: A027641, A010060

katestange commented 3 months ago

The text wasn't totally optimal, so I improved the way it displays factorizations

katestange commented 3 months ago

Ok, I really am done doing things to it. I'm done. I'm closing the editor. You can take a look if you like!

katestange commented 3 months ago

> When you bump up this.last, you should also call this.seq.fill(this.last) (but don't await it) so that in the background it starts factoring the new terms.

Oh, I just saw this comment. Ok, I won't close the editor. I'll fix now.

katestange commented 3 months ago

Ok, now, I promise, hands off awaiting your feedback! :)

gwhitney commented 3 months ago

Some questions and one recommendation:

  1. Is it a positive or a negative thing that among all of the bars there are tiny places (horizontally) where you can put the mouse such that you are between bars and no term is displayed at the bottom? Would it be better if when your mouse is within the left-to-right region of the bar chart, then always some term of the sequence was selected as the current term to display?
  2. In a similar vein, is it a positive or a negative that one gets these term displays when the mouse cursor is over the title/menu bar above the canvas?
  3. Also, I think because (to be on the safe side) you prepare three or four more "bars" than can fit on the screen, when the parameter tabs are to the right of the canvas (which they are by default), then the term displays continue as you move the mouse horizontally into the parameter tab, about 1/3 of the way across the parameter tab, showing values and factorizations that you can't actually see on the screen, until it mysteriously stops at some point. How do you feel about that behavior?
  4. Isn't there a direct calculation (using division by the bar width) you could do from the mouse position to tell you what sequence index it was on, so that all you need to do as far as detecting mouse overs on prime blocks is check the vertical positions of blocks vs the mouse pointer when you got to that one index, rather than comparing the mouse position to the geometry of every single prime block (where all of the comparisons for a given index that's not in the correct horizontal position are redundant)? Is it worth streamlining the calculation of term display/mouse highlighting in such a way?
  5. In drawTerm, you don't need to recheck that myTerm is not -1, 0, or 1 in the bit that draws the question mark. Just put that part in the "else" clause from the part that checks whether to draw a dash.
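The direct calculation raised in item 4 might be sketched as follows, assuming equally spaced bars and canvas coordinates with y increasing downward. `FenceGeometry`, `Block`, `mouseIndex`, and `primeUnderMouse` are hypothetical names.

```typescript
// All names here are hypothetical; geometry is in canvas pixels.
interface FenceGeometry {
    firstIndex: number // sequence index of the leftmost bar
    leftEdge: number // x-coordinate of that bar's left edge
    barWidth: number // horizontal pitch from one bar to the next
}

interface Block {
    prime: number
    top: number // smaller y (canvas y grows downward)
    bottom: number // larger y
}

// One division replaces a horizontal comparison against every block.
function mouseIndex(mouseX: number, g: FenceGeometry): number {
    return g.firstIndex + Math.floor((mouseX - g.leftEdge) / g.barWidth)
}

// Only the blocks of the one bar under the mouse need a vertical check.
function primeUnderMouse(mouseY: number, blocks: Block[]): number | undefined {
    return blocks.find(b => mouseY >= b.top && mouseY < b.bottom)?.prime
}
```

This sketch ignores the thin gaps between bars; a real version would also need to decide what a mouse position inside a gap means.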

As usual, I'll do a final review once I hear back that you've completed whatever you may do in response to these thoughts (or may not do, the only item where I feel strongly there should definitely be a code change is 5).

gwhitney commented 3 months ago

PS would it be better for the text when you mouse over a question mark to be something like " (factorization unknown)" rather than "(no factorization)"?

gwhitney commented 3 months ago

Oops sorry to pile up a few of these: When you mouse over a dash, you see e.g.

S(21) = 0
  = 0

I realize the second "= 0" is to fill the slot of the factorization, but maybe in these basic cases it's not really necessary to repeat in this way, and instead just say S(21) = 0 with no second line?

katestange commented 3 months ago

Ok, some responses implemented:

  1. I left the tiny horizontal cracks... they are important aesthetically and I'm not sure it makes any sense for a mouse in that crack to show any particular factorization.
  2. I now only show factorization when mouse is on canvas.
  3. Ditto.
  4. I have not done anything on this one. It would be a major refactor and it's complicated and not clear what is best.... I'm doing it the way it is because that way I can do the mouse check exactly when the bar is created, to avoid having to store all the heights somehow. I do see that my originally fairly simple design choice complicated issues as I added functionality, though. So I'm not totally sure how to reimagine this. If you think it's important I will.
  5. Done.
  6. Now (factorization unknown).
  7. No more =0 =0 etc.
  8. I aligned the factorization = sign with the original = sign for aesthetic reasons.

I'm done responding to your comments, back to you.

katestange commented 3 months ago

It suddenly occurs to me that instead of putting a question mark it would be prettier to put an empty bar (no colouring). Then you still get the term size, just not the filled in info. I'll do that.

katestange commented 3 months ago

Ok, I think the empty factorizations are much prettier now, what do you think?

gwhitney commented 3 months ago

> Ok, I think the empty factorizations are much prettier now, what do you think?

Yes, much better. Thanks for a good idea. But the bare thin outlines are a bit hard to see, especially when I zoom out a bit. Maybe a uniform grey or light purple (in the hue of the regular bars) fill? Just a thought.

gwhitney commented 3 months ago

If you try a uniform fill but like it better with them empty, I am OK with it.

katestange commented 3 months ago

Ok, will try that. I also noticed another small bug.

katestange commented 3 months ago

Which do you like?

[Screenshots from 2024-08-18: 19-34-14, 19-34-04, 19-33-26]

katestange commented 3 months ago

Ok, fixed bug and went with pale non-outlined bars for non-factorization. The reason I liked the empty ones was the clear implication that we don't know what's "in" them, i.e. we don't know the factorization. But the pale bars are prettier. I hope it's still clear the info is missing (not that the bars are all suddenly prime)? I'm worried about whether they are different enough.

gwhitney commented 3 months ago

> Ok, fixed bug and went with pale non-outlined bars for non-factorization. The reason I liked the empty ones was the clear implication that we don't know what's "in" them, i.e. we don't know the factorization. But the pale bars are prettier. I hope it's still clear the info is missing (not that the bars are all suddenly prime)? I'm worried about whether they are different enough.

I like the top one best and I think that's what you went with. I think the distinction from the ones with gradient is totally clear -- these look totally drained of life, very blah by comparison. I think it totally works and is prettier than the outline-only. I will do final review and hopefully merge when I can.

gwhitney commented 3 months ago

Do we want to do anything about the numbers/factorizations that are too long to fit on one line in the value/factorization display? E.g., put an ellipsis in the string of digits and indicate how many digits in all? Or split the representation onto multiple lines? Or leave them as they are, just extending off the right of the canvas into nothingness? Or...?

katestange commented 3 months ago

> Do we want to do anything about the numbers/factorizations that are too long to fit on one line in the value/factorization display? E.g., put an ellipsis in the string of digits and indicate how many digits in all? Or split the representation onto multiple lines? Or leave them as they are, just extending off the right of the canvas into nothingness? Or...?

Sorry, I was lazy on this count, but you're right, ellipsis looks better. I've done this now. Note: the way I implemented it, it keeps attempting to draw the rest of the factorization's primes after the first ellipsis occurs, but just fails until it runs out of primes. I don't think this is a performance issue, but is it a code style issue? It would be more complicated to add a flag to control flow and stop attempting to write the couple extra primes in rare cases.

gwhitney commented 3 months ago

Very nice! Two comments/questions:

  1. It seems to me that by "it fails until it runs out of primes" you mean "it draws ellipses off screen". I don't see too much point in that: I'd recommend you either do nothing in textCareful if textLeft is already beyond this.sketch.width, or leave the loop in printing factors if textLeft gets bigger than this.sketch.width, whichever seems better to you.
  2. Do you want to or want me to add to the ellipsization so that it shows the number of digits? There are lots of possible ways to do that. One tempting way that should be very recognizable without special knowledge on the part of the viewer is to switch to floating point notation, i.e., write a 1345-digit number starting with 8675309 and so on as 8.675309...e1344 if that is all that fits. Or something like ~8.675309e1344 is slightly more compact, with the ~ supposed to mean that the value is no longer exact, if you think that is understandable. Or if you don't like introducing the decimal point, we could do 8675309...[1345 digits] or 8675309...[+1338] if you think the latter is clear enough (I think it's a bit cryptic myself, but couldn't think of anything clearer that is compact, other than floating point).
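The second option in item 1 (leaving the loop once the text position passes the canvas width) might look like this sketch. `visibleFactors` and `measure` are hypothetical; `measure` stands in for whatever text-width function the drawing library provides.

```typescript
// Hypothetical sketch: return the factors that would actually be drawn.
function visibleFactors(
    factors: string[],
    textLeft: number,
    width: number,
    measure: (s: string) => number
): string[] {
    const drawn: string[] = []
    let left = textLeft
    for (const factor of factors) {
        if (left > width) break // already off screen, so stop trying
        drawn.push(factor)
        left += measure(factor)
    }
    return drawn
}
```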

I am contemplating item 2 because I could imagine circumstances in which the magnitude is of interest, but the raw ellipses lose that information. Or we can just leave raw ellipses. Your call.

katestange commented 3 months ago

> 1. It seems to me that by "it fails until it runs out of primes" you mean "it draws ellipses off screen". I don't see too much point in that: I'd recommend you either do nothing in textCareful if textLeft is already beyond this.sketch.width, or leave the loop in printing factors if textLeft gets bigger than this.sketch.width, whichever seems better to you.

Ok, thanks, that was my question. I'll do that now.

> 2. Do you want to or want me to add to the ellipsization so that it shows the number of digits? There are lots of possible ways to do that. One tempting way that should be very recognizable without special knowledge on the part of the viewer is to switch to floating point notation, i.e., write a 1345-digit number starting with 8675309 and so on as 8.675309...e1344 if that is all that fits. Or something like ~8.675309e1344 is slightly more compact, with the ~ supposed to mean that the value is no longer exact, if you think that is understandable. Or if you don't like introducing the decimal point, we could do 8675309...[1345 digits] or 8675309...[+1338] if you think the latter is clear enough (I think it's a bit cryptic myself, but couldn't think of anything clearer that is compact, other than floating point).

Concerns: One concern is that this is doable for the raw value, but seems pretty complicated for the factorization, since each prime is its own number with its own floating point notation, presumably. So that's a whole kettle of fish. Also, even just for the raw value, the approximate size of the raw value is shown on the graph, and this situation of wanting the exact number of digits without the exact value is getting a bit "niche".

OTOH, the notation 8675309...[1345 digits] seems the most widely understandable/accessible and most aesthetic, so I could be convinced to do that for the raw value and leave the factorization as is. It's slightly finicky because you need to know the number of digits of the number of digits. Maybe I'll try it and see.
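A sketch of the 8675309...[1345 digits] notation, simplified to count characters rather than measure pixel widths and assuming a non-negative digit string. `abbreviateDigits` is a hypothetical name. The "finicky" part shows up in the suffix: its length depends on how many digits the digit count itself has.

```typescript
// Hypothetical helper: shorten a long digit string to at most `maxChars`
// characters, keeping the leading digits and reporting the total count.
function abbreviateDigits(digits: string, maxChars: number): string {
    if (digits.length <= maxChars) return digits
    // The suffix length varies with the number of digits in the count.
    const suffix = `...[${digits.length} digits]`
    const keep = maxChars - suffix.length
    // If there is no room for even one leading digit, show only the suffix.
    if (keep < 1) return suffix
    return digits.slice(0, keep) + suffix
}

const longValue = '8675309' + '0'.repeat(1338) // 1345 digits in all
console.log(abbreviateDigits(longValue, 23))
```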

katestange commented 3 months ago

Show digits is committed. Take a look.

gwhitney commented 3 months ago

Looks good. I like it. I don't know of any other items, so I will do final review and merge if I don't see anything.

BTW on a small coding note: I advocate against using boolean arguments for two-way behavior switches (as opposed to say a logic function that takes a boolean value). That's because the call doSomething(arg1, arg2, true) is undecipherable for someone reading the code at a call site -- what does that true instruct doSomething to do? So if this textCareful function were part of an API, I would basically insist that the argument be an enum with values something like DIGITS and NONDIGITS. But for a little local function that's only used a couple places, it's OK as is.
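To illustrate the point about call sites, here is a sketch that uses a string-literal union as a lightweight analogue of the enum suggested above. `TextKind` and `truncationMarker` are hypothetical and do not reflect the actual signature of textCareful.

```typescript
// Named alternatives instead of a bare true/false switch.
type TextKind = 'digits' | 'nondigits'

// Hypothetical helper: choose the marker appended to truncated text.
function truncationMarker(text: string, kind: TextKind): string {
    return kind === 'digits' ? `...[${text.length} digits]` : '...'
}

// The call site states its intent without a trip to the definition,
// unlike a call such as truncationMarker('8675309', true).
console.log(truncationMarker('8675309', 'digits'))
console.log(truncationMarker('2 x 3 x 5', 'nondigits'))
```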

gwhitney commented 3 months ago

OK, I found a bug in that the digits of a very long value of a term were being miscounted because the characters of the S(162) = part were being included in the digit count. But as I got to fixing that, I just couldn't resist reducing the number of comparisons of the mouse position to various rectangles. So I gave in to the temptation to just compute the mouseIndex directly from the mouseX coordinate, and then only do the "extract prime" checks in drawing that one stack of bars. Sorry for mucking in your code, I just couldn't shake my aversion to lots of redundant tests.

Anyhow, doing this work left me with two extremely minor questions:

  1. There is a universal convention in the OEIS that in a given entry, the sequence that entry is about is represented as a(n). In this visualizer's bottom text, we use S(n) for essentially the identical thing. Do we want to change that to a(n) to match the OEIS convention?
  2. Looking at a sequence that changes signs back and forth and has both 1 and -1 entries and other positive and negative values, it's visually clear that there is a midline to the bar chart and that positive terms are represented above that midline, and negative terms below it -- except for -1, which is represented identically to 1 and above that midline. So that representation creates a bit of visual confusion. So I'd recommend that -1 terms be represented with the same thickness as 1 terms, except just below the midline. But that proposal then begs the question of how to represent 0s. Unless you leave a pixel of vertical space between the positive and negative terms, there's no way to put it vertically between the positive and negative terms. So maybe twice the height of the 1s and -1s and centered on that midline? Maybe in grey or maybe dotted? But then it seems weird to have 0 be thicker than 1 or -1. So... your thoughts on this point?

We could of course just leave things as they are, on both/either counts. Let me know and we can proceed accordingly.

katestange commented 3 months ago

I love how detail oriented you are! I actually was unhappy with S( ) but wasn't sure what else to use, but of course you are correct, lowercase a is totally the right thing! As for the bars, I separated the positive and negative terms by one pixel, then used that one pixel for 0 terms, and put 1 and -1 above and below. Let me know how you think it looks. I'm on the fence.

katestange commented 3 months ago

Extra thought: I was looking at your code changes, and I realized that maybe now that I have moved all negative terms down one notch, the mouseover detection will be off by one pixel? Well, if you think the new approach looks good enough to keep, then maybe that needs fixing. But I'll wait to hear from you first. Feel free to just fix that and merge if you are happy, or tell me to do it.

gwhitney commented 3 months ago

I don't see that moving the negative bars down a pixel has introduced an error in the prime sensing, because you adjust barStart to do that, and the prime sensing still uses barStart just like you had been doing. So it should all be consistent, and zooming way in it seemed fine behaviorally to me -- does it seem off to you? It can be tricky to know exactly where the "hotspot" of the mouse cursor is...

I like that all of the positive terms now line up, and the negative terms line up, and I like that in a sequence that has positives and zeros the zeros dip just slightly lower. So I am fine with leaving this behavior as you have it now, if you are. Let me know if in the end you are OK with this scheme for representing 1, 0, -1 and if so, I will do final review and hopefully merge.

gwhitney commented 3 months ago

Oops, nope, found a bug, sorry: the ellipsization showing the number of digits is not playing properly with zooming. To reproduce, do FactorFence on A027641, the Bernoulli numerators. Pan over into the forest of tall unfactored bars. Mouse over one of the first ones that causes ellipsization. Now zoom in and out, keeping your mouse pointer on that term. As you zoom out, the type gets smaller so that the number would fit without ellipsization, but it still ellipsizes. And as you zoom in, the type gets bigger but it doesn't ellipsize enough, so it goes back to being just a bunch of numbers running off the edge of the screen.

katestange commented 3 months ago

Thanks for finding the bug! Fixed!

gwhitney commented 3 months ago

Looks great. Here's a very tiny point, that I am not sure it is worth doing anything at all about: Suppose you view a sequence like Thue-Morse so that all you see are trivial cases on screen. Then when you hit U/O, nothing seems to happen. This is by design, since after all the log of 1 is 0 and 0 times any scale factor is 0. But the person making the visualization just has the experience of hitting a key that the legend says will "stretch", and yet nothing appears to happen. Should we do anything about this, and if so, what? Some possibilities:

Or like I said, we could just do nothing as this is a very fine point. Let me know your thoughts.

gwhitney commented 3 months ago

Hmm, just noticed another very odd behavior that I think is a bug?:

View A000045 V-F numbers in the standard view and zoom way out so that you can see the whole sequence up to where we start to lose factorizations. Now slowly scrub your mouse cursor horizontally across the body of the collection of bars, somewhere above the baseline so that you are crossing a lot of bars. What I see when I do this is that from time to time as I move the mouse, the guide text "Click select; ..." flickers to lavender and back to green, whereas I think it should just always stay green. The flickering is a bit distracting and I'm not sure what it means if anything.

katestange commented 3 months ago

I responded to your "tiny thing" by making key presses show visually as a colour change on the key instructions; and I improved the feel of the scaling (which was previously too fast on sequences with giant terms, and could also reverse positive and negative, which seemed silly) by moving from additive to multiplicative scaling.
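The difference between the two scaling schemes can be seen in a small sketch. The constants and function names are assumptions chosen for illustration, not the values used in FactorFence.

```typescript
const STRETCH_FACTOR = 1.1 // assumed ratio per key press
const STRETCH_STEP = 0.1 // assumed increment for the additive variant

// Additive scaling: repeated shrinking can push the scale through zero,
// which flips positive and negative bars.
function stretchAdditive(scale: number, presses: number): number {
    return scale + STRETCH_STEP * presses
}

// Multiplicative scaling: each key press changes the scale by a fixed
// ratio, so a positive scale stays positive.
function stretchMultiplicative(scale: number, presses: number): number {
    return scale * Math.pow(STRETCH_FACTOR, presses)
}

console.log(stretchAdditive(2, -30) < 0) // sign flipped
console.log(stretchMultiplicative(2, -30) > 0) // still positive
```

A fixed ratio also makes each press feel the same regardless of how large the terms are, which matches the "too fast on sequences with giant terms" complaint about the additive version.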

gwhitney commented 3 months ago

Great, the feedback on keystrokes is nice. And now only the "Click select;" part of the top line of the legend flickers back and forth between green and lavender when I scrub the mouse cursor along a zoomed-out V-F instance of FactorFence. Does that mean something, or is it a bug?

katestange commented 3 months ago

Ok, ready for final review (I hope!)

gwhitney commented 3 months ago

OK, I have completed another round of code/behavior review. Many observations. There is one at the beginning that requires action, marked by an asterisk (*) that it would be nice if you were to do, and a couple at the end that we might want to act on even though this has been a long saga, marked with a question mark (?) -- for those I am happy to do it myself if you agree we should take action, or equally happy with your doing it, and also fine with your saying "Enough already!" on these. The other items are just informational.

katestange commented 3 months ago

Thank you for all the review comments, those are wonderful! And thank you for doing the code refactoring. I had noticed that bit about two ways to make the highlighted prime come first but I have yet to break free of my "if it ain't broke, don't fix it" mentality and didn't refactor. :) Thanks.

> • ? When a term is known to be prime, is it worth it to print its value out twice rather than to just label it as prime? I get the technical logical consistency with showing the factorizations of composites, but is it perhaps more immediately humanly understandable to just put (prime) in parentheses after the (what's now the first) shown value (and get rid of the repetition)? Or if there is a worry about the (prime) going off screen for long primes, we could instead put = a prime or = prime or (prime) on the line below (instead of the repeated value).

I think (prime) is a great suggestion. Happy to do this.

> • ? I think mousewheel zoom is not optimally implemented, and the defect makes it pretty visually confusing for me. I think the usual convention these days in mousewheel zoom is that the current mouse position should be the center of the dilation, so that what your mouse is on stays put and everything expands away from or condenses toward it. I think this would be particularly useful when you are zoomed way out, and you want to inspect some particular bar. Right now, that's tricky, because as soon as you start zooming, the bar of interest moves away from where you were looking. This is particularly confusing when all of the bars in your neighborhood that's showing go off the edges of the screen vertically -- then there is no clue from the "shape" of the bar chart that it has in effect shifted left or right as a result of the zoom. Technically speaking, if we made this change, a zoom would be a pan (possibly with both vertical and horizontal components) together with what's now the zoom. Note I am not quite sure what the up- and down-arrow zoom should do. Perhaps they should be left just as they are now (using the origin as the dilation center)? Or they could check if the mouse pointer happens to be on the canvas and if so use that as the dilation center, otherwise use the origin as the dilation center.

This is a very good point. I think I probably did the easiest thing when I began and never came back to this, as I got used to it. But you're totally right, it should zoom on the mouse pointer.

I think the up/down should continue to use the origin (I suspect if it zooms on the mouse pointer, a user won't realize that and just experience it as random zoom centering).
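The "zoom is a pan plus a dilation" idea can be written down compactly. This sketch assumes a single uniform scale and the mapping screen = offset + world * scale; `View`, `toScreen`, and `zoomAbout` are hypothetical names, and the actual implementation may differ.

```typescript
// Hypothetical view state: screen = offset + world * scale.
interface View {
    offsetX: number
    offsetY: number
    scale: number
}

function toScreen(view: View, worldX: number, worldY: number): [number, number] {
    return [view.offsetX + worldX * view.scale, view.offsetY + worldY * view.scale]
}

// Dilate by `factor` about the screen point (centerX, centerY). The offset
// is adjusted (the "pan" part) so the content under that point stays put.
function zoomAbout(view: View, centerX: number, centerY: number, factor: number): View {
    return {
        offsetX: centerX - (centerX - view.offsetX) * factor,
        offsetY: centerY - (centerY - view.offsetY) * factor,
        scale: view.scale * factor,
    }
}

// Mousewheel: zoom about the pointer. Arrow keys: zoom about the origin,
// whose screen position is (offsetX, offsetY), so the offset is unchanged.
const start: View = {offsetX: 50, offsetY: 20, scale: 2}
const wheel = zoomAbout(start, 70, 30, 1.5)
const keys = zoomAbout(start, start.offsetX, start.offsetY, 1.5)
console.log(toScreen(start, 10, 5), toScreen(wheel, 10, 5), keys)
```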

I'll try to do this, although there's a good chance my head will explode in the process ;)

katestange commented 3 months ago

Ok I took care of * and the two ?'s, although my head did explode. More seriously, I don't know if I did it the "right" way.

katestange commented 3 months ago

Oh wait I forgot the pic again, will do that now.

katestange commented 3 months ago

Ok, back to you!

katestange commented 3 months ago

Hmm, it just occurred to me that when you drag it also selects, so probably we want to avoid that, and only select when you click without dragging. I have to head home now, so if you are looking at this soon/now, you could go ahead on that, otherwise I'll try when I have time again.

gwhitney commented 3 months ago

Awesome. The new mousewheel zoom is great, and the dragging is really nice. I will take care of making it so dragging doesn't highlight the bar you happened to be on.
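One common way to tell a click from a drag is to compare the press and release positions against a small movement threshold. The sketch below assumes that approach; it is not necessarily how the fix was made, and `Point`, `isClick`, and the 4-pixel default are hypothetical.

```typescript
interface Point {
    x: number
    y: number
}

// Treat a press/release pair as a click only if the pointer barely moved.
function isClick(pressed: Point, released: Point, threshold = 4): boolean {
    const distance = Math.hypot(released.x - pressed.x, released.y - pressed.y)
    return distance <= threshold
}

console.log(isClick({x: 0, y: 0}, {x: 1, y: 1})) // small movement
console.log(isClick({x: 0, y: 0}, {x: 40, y: 0})) // a drag
```

With a check like this, the select-a-prime action would run only when `isClick` returns true, and a drag would pan without changing the highlight.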