Cover negative dimen2 in `\texdimenwithunit`

jfbu / texdimens

Utilities and documentation related to TeX dimensional units

Cover negative dimen2 in `\texdimenwithunit` #13

Closed RuixiZhang42 closed 2 years ago

RuixiZhang42 commented 2 years ago

I see no problem with 2 extra lines to cover negative <dimen2>:

\def\texdimenwithunit_#1;#2{%
        \ifnum#1=\p@\texdimendothis\texdimenwithunit_p@\fi
        \ifnum#1>\p@\texdimendothis\texdimenwithunit_A\fi
        \ifnum#1=-\p@\texdimendothis\texdimenwithunit_p@\fi % handles f = -65536
        \ifnum#1<-\p@\texdimendothis\texdimenwithunit_A\fi  % handles f < -65536
        \texdimenorthat\texdimenwithunit_B#2#1;% -65536<f<65536; user input f=0 deserves an error
}%

The truncation and rounding directions should behave the way we want them to (unless I missed something?)

RuixiZhang42 commented 2 years ago

@jfbu I did miss something, sorry! My version fixed the case dimen2<-1pt but not -1pt<dimen2<0pt.

My two extra lines give -1.5 for \texdimenwithunit{3pt}{-2pt}; the current version gave -2.5 which was nonsense. Furthermore, for \texdimenwithunit{-3pt}{-2pt} my version gives 1.5 while the current version gives --2.5.

But testing on \texdimenwithunit{3pt}{-0.5pt}, both my version and the current version give -7.99998.

For -65536<f<65536, the macro \texdimenwithunit_Bb doing the Euclidean division needs to branch, i.e., it needs a mirror version (just some first thoughts, not proven):

2T+1 = k*2*f + R with 0<=R<2f,  for 0<f<65536
and
2T-1 = -(2|T|+1) = k*2*f - |R| with 2f<R<=0,  for -65536<f<0

jfbu commented 2 years ago

I really have to understand why issue creations on the repo do not trigger a notification :)

I feel what you say about how \texdimenwithunit_Bb should change is absolutely right, but the simpler, no-hassle way is to move the sign of the "unit" (say dim2) to the dimension dim1 to be "divided." So at the start, check the sign of dim2 and, if it is negative, replace dim1 by its opposite.

If I were coding for such a well structured environment as LaTeX3 I would also take this opportunity to raise an expandable error in case of division by zero...
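The sign-moving idea can be sketched outside TeX. The Python below is only an illustration, not the package's code: dimensions are plain integers in sp, and `move_sign_to_dividend` is a hypothetical helper name. It also shows where a division-by-zero check would naturally sit.

```python
PT = 65536  # 1pt = 65536sp

def move_sign_to_dividend(dim1_sp, dim2_sp):
    """Return (dim1, dim2) with dim2 > 0, moving a negative sign onto dim1."""
    if dim2_sp == 0:
        raise ZeroDivisionError("the unit dimension must be non-zero")
    if dim2_sp < 0:
        return -dim1_sp, -dim2_sp
    return dim1_sp, dim2_sp

# {3pt}{-2pt} is handed on as {-3pt}{2pt}
print(move_sign_to_dividend(3 * PT, -2 * PT))  # (-196608, 131072)
```

After this step only the positive-unit code path is needed, at the cost of one sign test per call.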

Do you feel it is worthwhile to extend the coverage to negative dim2? It is a slight overhead to do so, and I have no strong opinion. I can always prepare a PR for discussion anyhow. I will do it.

RuixiZhang42 commented 2 years ago

@jfbu This is because my opinion about a user using \texdimenwithunit{...}{1pt} had changed. This f=65536 case caused you a lot of trouble to filter out. But I realized that a user may be unaware that she used \texdimenwithunit{...}{1pt}. For example, a class designer might have declared \texdimenwithunit{...}{1em/10} because she wanted to base everything in steps of one-tenth of an em. If the body text were set in 10pt, she would unintentionally use the equivalent \texdimenwithunit{...}{1pt}. In the same vein, a negative dim2 might have been the result of other operations, e.g., \texdimenwithunit{...}{\@tempdima-\@tempdimb}, where \@tempdima=10pt and \@tempdimb=5mm; or \texdimenwithunit{...}{\wd0-\wd2}, where \box0 and \box2 hold who knows what content.

Anyhow, based on existing infrastructure, I think I found a solution to cover f<-65536 and f=-65536:

\def\texdimenwithunit_#1;#2{%
        \ifnum#1=\p@  \texdimendothis\texdimenwithunit_p@    \fi
        \ifnum#1=-\p@ \texdimendothis\texdimenwithunit_p@neg \fi % handles f=-65536
        \ifnum#1>\p@  \texdimendothis\texdimenwithunit_A     \fi
        \ifnum#1<-\p@ \texdimendothis\texdimenwithunit_A     \fi % handles f<-65536
        \texdimenorthat\texdimenwithunit_B#2#1;% -65536<f<65536
}%
% add \texdimenwithunit_p@neg, division by -2 instead of 2
\def\texdimenwithunit_p@neg#1#2;#3;{\expandafter
  \texdimenstrippt\the\dimexpr\numexpr#1#3/-2sp\relax}%

jfbu commented 2 years ago

As an aside when using dothis/orthat branching it is advantageous to put first the less likely conditions. So the order would be

        \ifnum#1=-\p@ \texdimendothis\texdimenwithunit_p@neg \fi % handles f=-65536
        \ifnum#1<-\p@ \texdimendothis\texdimenwithunit_A     \fi % handles f<-65536
        \ifnum#1=\p@  \texdimendothis\texdimenwithunit_p@    \fi
        \ifnum#1>\p@  \texdimendothis\texdimenwithunit_A     \fi

I find it a bit complicated to have to also dig into \texdimenwithunit_Bb to cover negative case for -65536<f<0, and have to think it through, although I see it is only one line.

In the meantime I did it the {dim1}{-dim2} --> {-dim1}{dim2} way. To recycle the existing macros "as is", however, this needs to check whether dim1 is negative, zero, or positive once dim2<0 is known (also to avoid an extra \numexpr/\dimexpr simply to flip a sign). Done at #14

jfbu commented 2 years ago

This is because my opinion about a user using \texdimenwithunit{...}{1pt} had changed. This f=65536 case caused you a lot of trouble to filter out. But I realized that a user may be unaware that she used

I agree, of course if we had trouble with 1pt it would not be wise to have simply considered it never happens and expect users to filter it out by themselves! Not sure if it is really realistic to expect divisions by negative dimensions to arise and not be somewhere a symptom of a mistake but I don't have strong opinion, and overhead to positive dim2 is acceptable.

jfbu commented 2 years ago

In the same vein, a negative dim2 might have been the result of other operations, e.g., \texdimenwithunit{...}{\@tempdima-\@tempdimb}, where \@tempdima=10pt and \@tempdimb=5mm; or \texdimenwithunit{...}{\wd0-\wd2}, where \box0 and \box2 hold who knows what content.

I see potential headache for the poor maintainers of this repo if they have to explain that \texdimenwithunit isn't really doing a division. \texdimenwithunit{3.5pt}{5.3pt} expands to 0.66039 but other ways to make divisions are possible in eTeX, with different implications:

$ rlwrap etex -jobname test
This is pdfTeX, Version 3.141592653-2.6-1.40.22 (TeX Live 2021) (preloaded format=etex)
 restricted \write18 enabled.
**texdimens
entering extended mode
(./texdimens.tex)
*\let\m\message

*\m{\the\dimexpr\numexpr\dimexpr3.5pt*65536/\dimexpr5.3pt\relax sp\relax}
0.66037pt
*\m{\texdimenwithunit{3.5pt}{5.3pt}}
0.66039
*\m{\the\dimexpr0.66037\dimexpr5.3pt\relax\relax}
3.49995pt
*\m{\the\dimexpr0.66039\dimexpr5.3pt\relax\relax}
3.50003pt
*\m{\the\dimexpr0.66038\dimexpr5.3pt\relax\relax}
3.50003pt

and 3.5/5.3=0.66037735849... so the user will certainly consider 0.66037 better than the 0.66039 given by the package, but we see the package does do the job described in the documentation. By luck the 3.50003 here looks closer than the 3.49995, but it did not even have to be (I think; I may need to re-look at the whole thing). But the user may ask: why 0.66039 and not 0.66038? Hard to explain... it is the behaviour of \the\dimexpr. Because in fact the two are the same for TeX:

*\m{\the\numexpr\dimexpr0.66038pt}
43279
*\m{\the\numexpr\dimexpr0.66039pt}
43279

and

*\m{\the\dimexpr43279sp}
0.66039pt

TeX's algorithm makes a choice and it is 0.66039, not 0.66038.
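Why 0.66038pt and 0.66039pt are the same dimension can be checked with a small emulation of how TeX converts a decimal fraction to a multiple of 1/65536. This is a sketch in Python rather than TeX; `scan_fraction_sp` is a name made up for the illustration, and it handles only the digits after the decimal point of a quantity expressed in pt.

```python
def scan_fraction_sp(digits):
    """Convert the digits after the decimal point to an integer number of sp.

    Mirrors the rounding TeX applies when it scans a decimal fraction:
    work from the last digit backwards in units of 2**-17, then halve
    with rounding. Only the first 17 digits are significant.
    """
    a = 0
    for d in reversed(digits[:17]):
        a = (a + int(d) * 131072) // 10
    return (a + 1) // 2

for text in ("66037", "66038", "66039"):
    print("0." + text + "pt ->", scan_fraction_sp(text), "sp")
# 0.66037pt -> 43278 sp
# 0.66038pt -> 43279 sp
# 0.66039pt -> 43279 sp
```

Two different five-digit decimals land on the same integer because there are 100000 five-digit fractions but only 65536 values of sp below 1pt.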

RuixiZhang42 commented 2 years ago

@jfbu Well, there is a simpler example: 0.00001pt and 0.00002pt represent the same dimension (both are exactly 1sp), and using \the TeX will print 0.00002pt. The algorithm used by TeX is explained in full detail in A Simple Program Whose Proof Isn’t by Knuth himself. The decimal produced is as short as possible (first priority), then it is as close to N/65536 as possible.

So this is a limitation with TeX’s internals. More specifically, this is a limitation with binary internal fixed-point representations and decimal interface inputs/outputs. Nothing can be done really. Even Knuth himself said in The TeXbook not to expect the fifth decimal number to be what you think it should be.
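For readers who want to experiment, here is a Python transcription of the printing direction, i.e. what \the does with a dimension given as an integer number of sp. It is a sketch written from the published algorithm, with the unit "pt" left off the output; the function name simply follows the name of the corresponding procedure in TeX's source.

```python
def print_scaled(s):
    """Shortest decimal that reads back as s/65536, as printed by TeX."""
    unity = 65536
    out = ""
    if s < 0:
        out, s = "-", -s
    out += str(s // unity) + "."
    s = 10 * (s % unity) + 5
    delta = 10
    while True:
        if delta > unity:
            s += 0x8000 - 50000  # round the final digit
        out += str(s // unity)
        s = 10 * (s % unity)
        delta *= 10
        if s <= delta:
            return out

print(print_scaled(1))      # 0.00002
print(print_scaled(43279))  # 0.66039
```

The loop stops as soon as the digits printed so far identify the value uniquely, which is why 32768sp prints as 0.5 and not 0.50000.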

jfbu commented 2 years ago

@RuixiZhang42 Yes, and thanks for the link. Unaware of it (and I still have to look at it), I had deciphered for myself some years back the innards of TeX's handling of dimensional inputs, when I got interested, and this is the basis of the explanations in the package doc.

I was not trying to say something could or should be modified, or should have been done otherwise at TeX level, but simply that documentation will always be insufficient. To explain what I mean consider this

$ rlwrap etex -jobname test
This is pdfTeX, Version 3.141592653-2.6-1.40.22 (TeX Live 2021) (preloaded format=etex)
 restricted \write18 enabled.
**texdimens
entering extended mode
(./texdimens.tex)
*\input xintexpr.sty
(lines cut)
*\let\m\message

*\def\test#1#2{\m{\texdimenwithunit{#1pt}{#2pt} versus \texdimenpt{\numexpr\dimexpr#1pt\relax*65536/\dimexpr#2pt\relax sp} versus \xintfloateval{#1/#2}}}

*\test{7.9362}{1.2346}
6.42815 versus 6.42813 versus 6.428154867973433
*\test{3.5}{5.3}
0.66039 versus 0.66037 versus 0.660377358490566
*\test{7.3962}{1.0123}
7.30635 versus 7.30634 versus 7.306332114985676
*\test{1.0123}{7.3962}
0.13687 versus 0.13687 versus 0.1368675806495227

where I did a few examples. Often the first two outputs are the same and even close to the 16-digit precision one, so I kept here only a few results, mostly chosen to exhibit "peculiarities".

We know that the \texdimenwithunit{dim1}{dim2} output D guarantees D <dim2>=dim1 if at all possible. But the more naive division (\numexpr\dimexpr#1pt\relax*65536/\dimexpr#2pt\relax), which I think is what is currently done by the macro available in L3 (note: for dim2<1pt we could fix potential overflow problems as done for \texdimenwithunit; perhaps the L3 macro actually does it, I have not looked), sometimes gives an output closer to the naive expectation of what the quotient should be (the {3.5}{5.3} case) and sometimes it is actually worse ({7.9362}{1.2346}).

This raises the semi-mathematical question: can we quantify which one is expected to be the best on average, from the naive point of view of the better match to a higher-precision result?

And this is precisely the black hole trap I want to avoid going into.

Indeed, if I were to add to the package an interface for the \numexpr\dimexpr#1pt\relax*65536/\dimexpr#2pt\relax alternative (which we know is not good for #2=1ex or #2=1em if we want a D such that Dex or Dem matches dim1), I would have to motivate it. And, as is sometimes typical of my errors, I would engage in a potentially horrendous waste of time studying this and would end up producing some account of my conclusions that nobody on earth will be interested in.

This, basically, is what I wanted to say on this... I am a bit wary of what documenting \texdimenwithunit implies.

RuixiZhang42 commented 2 years ago

@jfbu I think your current explanation “\texdimenwithunit{<dim1>}{<dim2>} outputs a decimal D that guarantees D <dim2>=<dim1> if at all possible” is very clear already. If you wish to further elaborate on this, here is my proposal:

Note: This decimal D does NOT necessarily resemble a floating-point
approximation of <dim1>/<dim2>. To be more precise, let us write
<dim1>=N1 sp and <dim2>=N2 sp, where N1 and N2 are positive integers.
Then D does NOT approximate the quotient N1/N2. In other words, the
number round(D*N2) does NOT necessarily equal N1.
Instead, the macro \texdimenwithunit guarantees that

    floor( round(D*65536) * N2 / 65536 ) approximates N1.

Keep in mind that for dimension quantities, we are working with TeX's
internal binary fixed-point representation, but the output number D
is in decimal. The macro \texdimenwithunit does not perform division
N1/N2 to get the decimal D, because TeX does not work like that.
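The expression floor( round(D*65536) * N2 / 65536 ) can be tried on the 3.5pt/5.3pt example from earlier in the thread. The Python sketch below assumes non-negative inputs and takes the factor already converted to an integer multiple of 1/65536; `times_unit` is a hypothetical name.

```python
UNITY = 65536  # sp per pt

def times_unit(factor_sp, unit_sp):
    """Dimension in sp obtained for D<unit>, for D >= 0 and unit > 0.

    factor_sp is D expressed as an integer multiple of 1/65536.
    """
    return factor_sp * unit_sp // UNITY

unit = 347341    # 5.3pt
target = 229376  # 3.5pt
for f in (43278, 43279):  # 0.66037 and 0.66039
    print(f, "->", times_unit(f, unit), "sp")
# 43278 -> 229373 sp
# 43279 -> 229378 sp
```

Consecutive factors move the product by about 5.3sp here, so 229376sp (3.5pt) is skipped: this is a case where the "if at all possible" clause applies, and 0.66039 is the factor whose product lies nearer to the target.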

jfbu commented 2 years ago

Well, I had better stay with my current terse formulation :). For example, the last sentence here might need a continuation that tries to say what the macro does... It does indeed do a division at some point: if dim1 and dim2 are D1 pt and D2 pt (if the inputs use units other than the pt it is even more complicated...) then we compute (positive case) Y = round((E1+1/131072)/E2,16), where round(x,16) means rounding to 16 binary places, and E1 and E2 are themselves the roundings to 16 binary places of the decimal inputs D1 and D2! And then the macro outputs a decimal W which is an approximation, to at most 5 decimal places, of this Y, which was computed in binary as above and had 16 binary places...

If I were to include that, I would have to comment about "why the 1/131072 "... this is beginning to look like the black hole I mentioned.
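In integer terms, with N1 = 65536*E1 and N2 = 65536*E2, the formula above reads Y = round((N1 + 1/2)*65536/N2) in units of 1/65536. The Python sketch below compares it with the plain rounded quotient for positive inputs; both function names are invented for the illustration, and the second one is only my reading of the formula in the comment, not a transcription of the macro.

```python
UNITY = 65536

def quotient_naive(n1, n2):
    """round(N1*65536/N2) for positive integers, halves rounded up."""
    return (2 * n1 * UNITY + n2) // (2 * n2)

def quotient_half_sp(n1, n2):
    """round((N1 + 1/2)*65536/N2): the dividend is shifted by half an sp."""
    return ((2 * n1 + 1) * UNITY + n2) // (2 * n2)

n1, n2 = 229376, 347341  # 3.5pt and 5.3pt
print(quotient_naive(n1, n2))    # 43278, printed by TeX as 0.66037
print(quotient_half_sp(n1, n2))  # 43279, printed by TeX as 0.66039
```

One way to see the half sp: when the unit exceeds 1pt, the factors N whose truncated product equals T fill an interval of length below 1 centred near (T + 1/2)*65536/N2, so rounding that midpoint recovers N whenever such an N exists.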

But this repo exists and is public, so people can still read the developers heated debates! :smile:

RuixiZhang42 commented 2 years ago

@jfbu For those who dare to step into the black hole:

This raises the semi-mathematical question: can we quantify which one is expected to be the best on average, from the naive point of view of the better match to a higher-precision result?

This question can be rephrased as follows: Let D be the output of \texdimenwithunit{N1 sp}{N2 sp}, then what is the “absolute error” abs( N1/N2 - D ), from a floating-point approximation view? Let N1 run through 1..N2, and we can ask two questions: (1) what is the average “error”? (2) what is the maximum “error”?

I don’t think analytical formulae exist, but the above rephrased problem is easy to program:

(* Mathematica code *)
(* The function tdwithunit[N1,N2] gives a decimal that is as close *)
(* to D as possible, where the fifth decimal place may differ *)
tdwithunit[N1_, N2_] := 
 10^-5 Floor[10^5/65536 Floor[(2 N1 + 1) (32768/N2) + 1/2] + 1/2]

N[Table[
   {
    N2,
    Mean[Table[Abs[N1/N2 - tdwithunit[N1, N2]], {N1, 1, N2}]], (* average error *)
    Max[Table[Abs[N1/N2 - tdwithunit[N1, N2]], {N1, 1, N2}]]   (* maximum error *)
    },
   {N2, 1, 3} (* We look at N2=1sp, 2sp, 3sp *)
   ]] // MatrixForm

which yields

N2 (in sp)  |  Ave err  |  Max err  |
-------------------------------------
1           |  0.5      |  0.5      |
2           |  0.25     |  0.25     |
3           |  0.16667  |  0.16667  |

Playing around with the code for N2=32768 (0.5pt), N2=66342 (1.0123pt), N2=347341 (5.3pt), we get

N2 (in sp)  |  Ave err       |  Max err       |
-----------------------------------------------
 32768      |  0.0000152588  |  0.0000202539  |
 66342      |  0.0000078263  |  0.0000200862  |
347341      |  0.0000044971  |  0.0000140508  |
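For those without Mathematica, the same experiment can be written in Python with exact rationals. This is a direct port of the snippet above, so it inherits its simplification: the final step rounds to five decimals instead of emulating TeX's own printing routine, and the fifth decimal may therefore differ from real TeX output.

```python
from fractions import Fraction
from math import floor

def tdwithunit(n1, n2):
    """Port of the Mathematica emulation; n1 and n2 are in sp."""
    y = floor(Fraction((2 * n1 + 1) * 32768, n2) + Fraction(1, 2))
    return Fraction(floor(Fraction(10**5 * y, 65536) + Fraction(1, 2)), 10**5)

def stats(n2):
    """Mean and maximum of |N1/N2 - D| over N1 = 1..N2."""
    errs = [abs(Fraction(n1, n2) - tdwithunit(n1, n2)) for n1 in range(1, n2 + 1)]
    return sum(errs) / len(errs), max(errs)

for n2 in (1, 2, 3):
    mean, worst = stats(n2)
    print(n2, float(mean), float(worst))
```

Using `Fraction` keeps every intermediate value exact, so the only rounding is the one written explicitly in `tdwithunit`.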

jfbu commented 2 years ago

Something seems odd; perhaps the emulation of \the\dimexpr is not the correct formula (I have forgotten what Knuth does exactly). Indeed I can produce larger maximal errors:

Computed with tex and the real \texdimenwithunit. The last number is the first n1 where max is attained.

N2      ave. absolute error maximal absolute error
1       0.5000000000        0.5000000000 (1)
10      0.0500000000        0.0500000000 (1)
87      0.0057471264        0.0057597701 (80)
100     0.0050000000        0.0050000000 (1)
1234    0.0004051864        0.0004181524 (461)
32768   0.0000152588        0.0000228516 (1772)
66342   0.0000078727        0.0000225468 (987)
347341  0.0000045871        0.0000166208 (62203)

confirmation:

$ rlwrap etex -jobname test
This is pdfTeX, Version 3.141592653-2.6-1.40.22 (TeX Live 2021) (preloaded format=etex)
 restricted \write18 enabled.
**texdimens
entering extended mode
(./texdimens.tex)
*\message{\texdimenwithunit{62203sp}{347341sp}}
0.1791
*\input xintexpr.sty
(lines)
*\message{\xintieval{[10] 62203/347341}}
0.1790833792
*\message{\xintieval{[10] 0.1791-0.1790833792}}
0.0000166208
*\message{\texdimenwithunit{987sp}{66342sp}}
0.0149
*\message{\xintieval{[10] 987/66342}}
0.0148774532
*\message{\xintieval{[10] 0.0149-0.0148774532}}
0.0000225468
*\message{\texdimenwithunit{1772sp}{32768sp}}
0.0541
*\message{\xintieval{[10] 1772/32768}}
0.0540771484
*\message{\xintieval{[10] 0.0541-0.0540771484}}
0.0000228516

source code (it takes more than four (!) minutes on a ten-year-old 2.8 GHz computer)

\input texdimens
\input xintexpr.sty

\def\repx#1{#1#1#1#1#1#1#1#1#1#1}
\def\repc#1{\repx{#1}\repx{#1}\repx{#1}\repx{#1}\repx{#1}%
            \repx{#1}\repx{#1}\repx{#1}\repx{#1}\repx{#1}}
\def\repm#1{\repc{#1}\repc{#1}\repc{#1}\repc{#1}\repc{#1}%
            \repc{#1}\repc{#1}\repc{#1}\repc{#1}\repc{#1}}

\catcode`@ 11
\newdimen\dimi
\newdimen\dimii
\newdimen\onesp \onesp 1sp
\def\getstats#1{%
    \def\maxDelta{0}\def\sumDeltas{0}%
    \dimii#1sp
    \dimi\z@
% proceed by chunks of 1000 reps to avoid memory problems
    \xintReplicate{\xintiieval{#1//1000}}{%
       \repm{%
       \advance\dimi\onesp
       \edef\Dfromtdwu{\texdimenwithunit{\dimi}{\dimii}}%
% \Dfromtdwu\par % to check
% round to 10 decimal places (in fixed point, not floating point sense)
% (then average will be ok also to 10 places)
       \edef\deltaD{\xintiexpr[10] abs(\dimi/\dimii-\Dfromtdwu)\relax}%
       \xintifboolexpr{\deltaD>\maxDelta}
                      {\let\maxDelta\deltaD
                       \edef\nmax{\the\numexpr\dimi}}
                      {}%
% this sum is computed exactly, all numbers have 10 decimal places
       \edef\sumDeltas{\xintexpr\sumDeltas+\deltaD\relax}%
       }%
     }%
% last batch
     \xintReplicate{\xintiieval{#1/:1000}}{%
       \advance\dimi\onesp
       \edef\Dfromtdwu{\texdimenwithunit{\dimi}{\dimii}}%
       \edef\deltaD{\xintiexpr[10] abs(\dimi/\dimii-\Dfromtdwu)\relax}%
       \xintifboolexpr{\deltaD>\maxDelta}
                      {\let\maxDelta\deltaD
                       \edef\nmax{\the\numexpr\dimi}}
                      {}%
       \edef\sumDeltas{\xintexpr\sumDeltas+\deltaD\relax}%
       }%
     \edef\z{#1&\xintieval{[10]\sumDeltas/#1}&\xintthe\maxDelta&(\nmax)}\z
}%

\tabskip10pt

% \globaldefs1 % (see the \edef\z trick above rather)

\halign{#&#&#&\hfil#\cr
N2&ave. absolute error&maximal absolute error&\cr
\getstats{1}\cr
\getstats{10}\cr
\getstats{87}\cr
\getstats{100}\cr
\getstats{1234}\cr
\getstats{32768}\cr  % patience!
\getstats{66342}\cr  % get a coffee!
\getstats{347341}\cr % go for a walk
}

\bye

(side remark: usually I don't do replicate 1000 the way above; it would rather be like this)

\long\def\ReplicateM#1{\ReplicateC{#1#1#1#1#1#1#1#1#1#1}}%
\long\def\ReplicateC#1{\ReplicateX{#1#1#1#1#1#1#1#1#1#1}}%
\long\def\ReplicateX#1{#1#1#1#1#1#1#1#1#1#1}%

(but the fun with TeX is to always reinvent the wheel, and do worse than last time)

jfbu commented 2 years ago

For comparison

N2     ave. absolute error maximal absolute error
1      0                   0            (1)
10     0                   0            (1)
87     0.0000047193        0.0000126437 (37)
100    0                   0            (1)
1234   0.0000043657        0.0000129660 (153)
32768  0.0000026396        0.0000076172 (92)
66342  0.0000044520        0.0000150101 (15598)
347341 0.0000044515        0.0000151813 (111467)

with simpler "division"

edited: the \texdimendivide macro was correct for dim1/dim2 with dim1<=dim2 as used here, but would have been wrong if used for dim1>dim2 due to a factor of 2 error somewhere.

\input texdimens
\input xintexpr.sty

\def\repx#1{#1#1#1#1#1#1#1#1#1#1}
\def\repc#1{\repx{#1}\repx{#1}\repx{#1}\repx{#1}\repx{#1}%
            \repx{#1}\repx{#1}\repx{#1}\repx{#1}\repx{#1}}
\def\repm#1{\repc{#1}\repc{#1}\repc{#1}\repc{#1}\repc{#1}%
            \repc{#1}\repc{#1}\repc{#1}\repc{#1}\repc{#1}}

\catcode`@ 11
\catcode`_ 11

% quickly copied and modified from \texdimenwithunit of 0.99d release
% I am doing it ONLY FOR dim1 > 0, dim2>0
\def\texdimendivide#1#2{\expandafter\texdimendivide_i
    % let's not stupidly multiply #1 by 2
    \the\numexpr\dimexpr#2\expandafter;\the\numexpr\dimexpr#1;%
}%
\def\texdimendivide_i#1;{%
        \ifnum#1=\p@\texdimendothis\texdimendivide_p@\fi
        \ifnum#1>\p@\texdimendothis\texdimendivide_A\fi
        \texdimenorthat\texdimendivide_B#1;%
}%
\def\texdimendivide_p@#1;#2;{\texdimenpt{#2sp}}%
% unit>1pt,
\def\texdimendivide_A#1;#2;{\texdimenpt{\numexpr#2*\p@/#1sp}}%
% unit<1pt. ASSUME dim1 POSITIVE ATTENTION
\def\texdimendivide_B#1;#2;{\expandafter\texdimendivide_Bc
    \the\numexpr(2*#2-#1)/(2*#1);#2;#1;%
}%
\def\texdimendivide_Bc#1;#2;#3;{%
     \the\numexpr#1+\expandafter\texdimenstrippt
     \the\dimexpr\numexpr(#2-#1*#3)*\p@/#3sp\relax
}%
\catcode`_ 8

\newdimen\dimi
\newdimen\dimii
\newdimen\onesp \onesp 1sp
\def\getstats#1{%
    \edef\maxDelta{\xintexpr0\relax}\edef\sumDeltas{\xintexpr0\relax}%
    \def\nmax{1}%
    \dimii#1sp
    \dimi\z@
% proceed by chunks of 1000 reps to avoid memory problems
    \xintReplicate{\xintiieval{#1//1000}}{%
       \repm{%
       \advance\dimi\onesp
       \edef\Dfromtdwu{\texdimendivide{\dimi}{\dimii}}%
% \Dfromtdwu\par % to check
% round to 10 decimal places (in fixed point, not floating point sense)
% (then average will be ok also to 10 places)
       \edef\deltaD{\xintiexpr[10] abs(\dimi/\dimii-\Dfromtdwu)\relax}%
       \xintifboolexpr{\deltaD>\maxDelta}
                      {\let\maxDelta\deltaD
                       \edef\nmax{\the\numexpr\dimi}}
                      {}%
% this sum is computed exactly, all numbers have 10 decimal places
       \edef\sumDeltas{\xintexpr\sumDeltas+\deltaD\relax}%
       }%
     }%
% last batch
     \xintReplicate{\xintiieval{#1/:1000}}{%
       \advance\dimi\onesp
       \edef\Dfromtdwu{\texdimendivide{\dimi}{\dimii}}%
       \edef\deltaD{\xintiexpr[10] abs(\dimi/\dimii-\Dfromtdwu)\relax}%
       \xintifboolexpr{\deltaD>\maxDelta}
                      {\let\maxDelta\deltaD
                       \edef\nmax{\the\numexpr\dimi}}
                      {}%
       \edef\sumDeltas{\xintexpr\sumDeltas+\deltaD\relax}%
       }%
     \edef\z{#1&\xintieval{[10]\sumDeltas/#1}&\xintthe\maxDelta&(\nmax)}\z
}%

\tabskip10pt

%\globaldefs1 % (see the \edef\z trick above rather)

\halign{#&#&#&\hfil#\cr
N2&ave. absolute error&maximal absolute error&\cr
\getstats{1}\cr
\getstats{10}\cr
\getstats{87}\cr
\getstats{100}\cr
\getstats{1234}\cr
\getstats{32768}\cr  % patience!
\getstats{66342}\cr  % get a coffee!
\getstats{347341}\cr % go for a walk
}

\bye

jfbu commented 2 years ago

@RuixiZhang42 In merged PR #17 I have re-implemented \texdimenwithunit{dim1}{dim2} for the dim2<1pt branch. This also makes it unnecessary to add another division macro, as I briefly considered in issue #15.

Now we get these modified stats (only with N2<=65536 is the new macro different).

% with 0.999 \texdimenwithunit (changed only N2<=65536)

N2      ave. absolute error maximal absolute error
1       0                   0            (1)
10      0.0000040000        0.0000100000 (2)
87      0.0000076179        0.0000206897 (21)
100     0.0000048000        0.0000100000 (1)
1234    0.0000078935        0.0000200972 (268)
32768   0.0000026396        0.0000076172 (92)

to be compared to formerly

N2      ave. absolute error maximal absolute error
1       0.5000000000        0.5000000000 (1)
10      0.0500000000        0.0500000000 (1)
87      0.0057471264        0.0057597701 (80)
100     0.0050000000        0.0050000000 (1)
1234    0.0004051864        0.0004181524 (461)
32768   0.0000152588        0.0000228516 (1772)

RuixiZhang42 commented 2 years ago

@jfbu What an improvement! Congrats! Now \texdimenwithunit{3pt}{0.5pt} produces 6.0, whereas formerly it produced 6.00002 (which was technically correct, but ugly). I’m glad that our trip into the “black hole” turned out to be fruitful. I also see in the new implementation you let TeX do the computation and then R <- R+1 if necessary, which looks very similar to \texdimenbothbpmm where you let TeX do the computation and then a <- a-1 if necessary. I’m awed at how things connect to each other.

jfbu commented 2 years ago

@RuixiZhang42 I have added to the repo a file to examine the "deltas" between \texdimenwithunit{N1 sp}{N2 sp} output and the exact fractions N1/N2. This confirms the bias towards positive deltas indicated in my update to the user documentation. Here is the output (after a long wait): [screenshot of 2021-11-07]

In the table the only non-biased N2 is 32768 (and 1...)

edit: the screenshot says we limit to N2<=65536 but in fact I did include the >65536 values... clearly I am in the black hole now and have nothing better to do!

(looking at absolute errors is a bit strange as the fractions fill up the [0..1] range, but TeX output is fixed point anyhow)

N2     ave. non neg.   max non neg. nmax,Ntot      
1      0               0            (1, 1)          
10     0.0000040000    0.0000100000 (2, 10)         
87     0.0000086673    0.0000206897 (21, 74)        
100    0.0000048000    0.0000100000 (1, 100)        
1000   0.0000051200    0.0000200000 (28, 1000)      
1234   0.0000084967    0.0000200972 (268, 1124)     
4321   0.0000085298    0.0000222865 (3458, 3946)    
10000  0.0000051520    0.0000200000 (56, 10000)     
14285  0.0000085290    0.0000225761 (7835, 13048)   
32768  0.0000026393    0.0000076172 (1956, 16416)   
43917  0.0000085265    0.0000226086 (22120, 40126)  
66342  0.0000084620    0.0000225468 (987, 60414)    
347341 0.0000050765    0.0000166208 (62203, 206259)  

N2      ave. non pos.  max non pos.  nmin,Ntot    Nexact    
1       0              0            (1, 1)           (1)   
10      0              0            (1, 6)           (6)   
87     -0.0000015271  -0.0000034483 (69, 14)         (1)   
100     0              0            (1, 52)          (52)  
1000    0              0            (1, 504)         (504) 
1234   -0.0000016989  -0.0000048622 (174, 112)       (2)   
4321   -0.0000018317  -0.0000069428 (2463, 376)      (1)   
10000   0              0            (1, 5008)        (5008)
14285  -0.0000018360  -0.0000073154 (10661, 1240)    (3)   
32768  -0.0000026347  -0.0000076172 (92, 16384)      (32)  
43917  -0.0000018422  -0.0000072159 (23026, 3792)    (1)   
66342  -0.0000018658  -0.0000074734 (32183, 5930)    (2)   
347341 -0.0000038716  -0.0000137418 (285137, 141083) (1)

(I should not have attached the 120K png of the screenshot)

RuixiZhang42 commented 2 years ago

In the table the only non-biased N2 is 32768 (and 1...)

It can be seen from ceil(R*65536/f) that the non-biased N2 are 2**0, 2**1, 2**2, …, 2**15, 2**16. All other N2<=65536 are subject to ceiling so they tend to overestimate.
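The power-of-two remark can be checked numerically at the binary stage, before any decimal printing. The Python sketch below assumes, as the comment does, that the quotient is ceil(N1*65536/N2) and measures how far that ceiling lies above the exact value; the helper names are hypothetical.

```python
from fractions import Fraction

UNITY = 65536

def ceil_quotient(n1, n2):
    """ceil(N1*65536/N2) using integer arithmetic only."""
    return -((-n1 * UNITY) // n2)

def binary_excess(n1, n2):
    """Ceiling minus exact quotient, in units of 1/65536 (always >= 0)."""
    return ceil_quotient(n1, n2) - Fraction(n1 * UNITY, n2)

for n2 in (10, 87, 32768):
    rounded_up = sum(binary_excess(n1, n2) > 0 for n1 in range(1, n2 + 1))
    print(n2, "->", rounded_up, "of", n2, "quotients rounded up")
```

For N2 = 32768 no quotient is rounded up, since 65536/N2 is an integer. Note that this counts only the binary quotient: the later conversion to a decimal adds its own rounding, which is why the exact counts in the tables above follow a different pattern.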

jfbu commented 2 years ago

I added to the getstats_withunit.tex file a validation, for N2<65536. For those N2<65536 tested we can be sure the \texdimenwithunit output D does verify D <N2sp> = N1 sp for all N1=1..N2.

(I was sure already...)
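A rough Python counterpart of that validation, under the assumption that the output D corresponds in binary to Y = ceil(N1*65536/N2). It checks that multiplying back with truncation returns N1 exactly; it does not go through the decimal printing and re-scanning of D, and the names are made up for the sketch.

```python
UNITY = 65536

def ceil_quotient(n1, n2):
    """ceil(N1*65536/N2) using integer arithmetic only."""
    return -((-n1 * UNITY) // n2)

def validates(n2):
    """True if floor(Y*N2/65536) == N1 for every N1 in 1..N2."""
    return all(ceil_quotient(n1, n2) * n2 // UNITY == n1
               for n1 in range(1, n2 + 1))

print(all(validates(n2) for n2 in (1, 10, 87, 100, 1234)))  # True
```

The check succeeds for every N2 <= 65536 because Y*N2/65536 lies in [N1, N1 + N2/65536), an interval that contains no integer other than N1.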

RuixiZhang42 commented 2 years ago

Wait a sec. The formula ceil(R*65536/f) works also when f>65536 right? So there is no need for (2T+1)*(psi/2) and no need for branching, right?

jfbu commented 2 years ago

@RuixiZhang42

Right indeed, as per https://github.com/jfbu/texdimens/blob/fa42b77b8ed9726ce41a91207abd14c5488382b9/texdimens/texdimens.tex#L29-L30

Both ceil(T/phi) and ceil((T+1)/phi)-1 work and give the sole N with T = trunc(N phi). But at that time I considered computing the ceil was too complicated...
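Both expressions are easy to test in integer arithmetic, writing phi = f/65536 with f > 65536. The Python sketch below uses f = 347341 (5.3pt) and hypothetical helper names; N is the factor in units of 1/65536 and T the truncated product in sp.

```python
UNITY = 65536

def ceil_div(a, b):
    """ceil(a/b) for integers with b > 0."""
    return -(-a // b)

def inverse_candidates(t, f):
    """ceil(T/phi) and ceil((T+1)/phi) - 1, with phi = f/65536."""
    return ceil_div(t * UNITY, f), ceil_div((t + 1) * UNITY, f) - 1

f = 347341  # 5.3pt, so phi > 1
for n in (1, 43279, 65536):
    t = n * f // UNITY  # T = trunc(N*phi)
    print(n, t, inverse_candidates(t, f))
```

For every attained T both candidates return the original N, because the interval [T/phi, (T+1)/phi) has length below 1 and so contains N and no other integer.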

As I had forgotten this opinion, in the phi<1 context I decided today that after all I could compute a ceil if determined enough...

This being said as the result (in sp unit) for f>65536 (phi>1) is unique for those attained dimensions, the criterion is easiness of implementation so I feel the round((T+0.5)/phi) was good.

The ceil() approaches are for dimenup and dimendown challenges. As you remarked earlier there is convergence of methods, so probably if we decided to go the ceil() way we would end up re-discovering the actual implementations of up and down.

jfbu commented 2 years ago

@RuixiZhang42 Deep in the black hole I have outlined an alternative new approach to the core \texdimenUUdown and \texdimenUUup macros at commit 13555b3