libtom / libtommath

LibTomMath is a free open source portable number theoretic multiple-precision integer library written entirely in C.
https://www.libtom.net
Other
656 stars 194 forks source link

mp_fwrite should not write \0 character #543

Closed J08nY closed 1 year ago

J08nY commented 1 year ago

Currently, mp_fwrite writes a terminating \0 character to the stream it is given. This does not make much sense, as usually one uses this fwrite to print something to the stdout or to a text file where no zero bytes should appear.

This is due to the mp_to_radix function always appending the NULL byte and the subsequent write in fwrite taking it all. An easy solution would be to trim the NULL byte during the fwrite call:

https://github.com/libtom/libtommath/blob/develop/mp_fwrite.c#L23

   if ((err = mp_to_radix(a, buf, size, &written, radix)) == MP_OKAY) {
      if (fwrite(buf, written - 1, 1uL, stream) != 1uL) {
         err = MP_ERR;
      }
   }

instead of

   if ((err = mp_to_radix(a, buf, size, &written, radix)) == MP_OKAY) {
      if (fwrite(buf, written, 1uL, stream) != 1uL) {
         err = MP_ERR;
      }
   }
czurnieden commented 1 year ago

I did a PR (#542) a couple of days ago (doing basically the same). It is mostly just annoying but if you write to a file and parse that file with anoterh program, say R, that program might choke on that null-byte.

$ ./mr-timings   > millionprimeisprime
64-bit is_prime all:         0 h 10 min  2 sec 730 ms 885 usec 113 nsec
64-bit average:              0 h  0 min  0 sec   0 ms 602 usec 730 nsec
$ wc -l  millionprimeisprime
1000000 millionprimeisprime
$ sort -un millionprimeisprime | wc -l
1000000
$  Rscript -e 'options(digits=16);input <- read.table(file("stdin", "r"), header=F, col.names = c("data"),, sep = "\n");summary(input);data1 <- input$data ;sd(data1)' < millionprimeisprime
Warning messages:
1: In read.table(file("stdin", "r"), header = F, col.names = c("data"),  :
  line 1 appears to contain embedded nulls
2: In read.table(file("stdin", "r"), header = F, col.names = c("data"),  :
  line 2 appears to contain embedded nulls
3: In read.table(file("stdin", "r"), header = F, col.names = c("data"),  :
  line 3 appears to contain embedded nulls
4: In read.table(file("stdin", "r"), header = F, col.names = c("data"),  :
  line 4 appears to contain embedded nulls
5: In read.table(file("stdin", "r"), header = F, col.names = c("data"),  :
  line 5 appears to contain embedded nulls
6: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
  embedded nul(s) found in input
      data                  
 Min.   :8.10902965532e+12  
 1st Qu.:4.61638874678e+18  
 Median :9.22246895331e+18  
 Mean   :9.22294697717e+18  
 3rd Qu.:1.38371901260e+19  
 Max.   :1.84467227037e+19  
[1] 5321654314370777088

Did it have a reason once, does it still or was it just something nobody cared for?

sjaeckel commented 1 year ago

Did it have a reason once, does it still or was it just something nobody cared for?

I searched through the history and IIUC it was introduced in 2019 (71d1b7b9d822f8728526064834d31df57a39460f) and it's an oversight nobody noticed until now.

czurnieden commented 1 year ago

I searched through the history and IIUC it was introduced in 2019 (https://github.com/libtom/libtommath/commit/71d1b7b9d822f8728526064834d31df57a39460f) and it's an oversight nobody noticed until now.

And I would have sworn that it was intentional! Mandela effect? ;-)