libtom / libtommath

LibTomMath is a free open source portable number theoretic multiple-precision integer library written entirely in C.
https://www.libtom.net
Other
656 stars 194 forks source link

mp_complement #161

Closed czurnieden closed 5 years ago

czurnieden commented 5 years ago

Waht exactly does mp_complement do? The comment says b = ~a, so it is supposed to be a not function? not like in and, or, xor? Has it something to do with the new two's complement functions? Please clarify, so it can be described in the correct manner in the documentation.

minad commented 5 years ago

yes, it's a bitwise not.

sjaeckel commented 5 years ago

AFAIU it's the one's complement not only a bitwise not

sjaeckel commented 5 years ago

s/one's/two's/

sjaeckel commented 5 years ago

Ah I overlooked the - 1 operation, yeah then it should be a binary not ... I'll stop browsing code on my mobile ...

minad commented 5 years ago

:)

czurnieden commented 5 years ago

The algorithm ~x = (-x) - 1 does not work in libtommath with its sign-magnitude numbers (sort of, the magnitude is arbitrary here), it only works in two's complement with fixed size.

(Examples shamelessly stolen from Wikpedia)

Binary NOT (also known as "complement"):

 a = 1010 1011 (171_10)
~a = 0101 0100 (84_10)

Two's complement

 2 = 1111 1110
-2 = 0000 0010 

Binary negation (with the normal two's complement representation)

 2 = 0000 0010
-2 = 1111 1110 

That all is different in LTM:

a =  2 = 0000 0010 and the variable "a.sign" is set to MP_ZPOS
a = -2 = 0000 0010 and the variable "a.sign" is set to MP_NEG
-(a) -1 = -(-2) - 1 =   0000 0001 and the variable "a.sign" is set to MP_ZPOS

But a normal binary complement is indeed missing in LTM. I mean something like

/* limbwise, so b <- ~a =>  ~b == a  if a->used == b->used*/
int mp_not(mp_int *a, mp_int *b)
{
   int i, res = MP_OKAY;

   if (USED(a) <= 1) {
      if (a->alloc == 0) {
         if ((res = mp_grow(a, 2)) != MP_OKAY) {
            return res;
         }
         USED(a) = 1;
      }
      DIGIT(b,0) = ~(DIGIT(a,0)) & MP_MASK;
      USED(b) = USED(a);
      return res;
   }

   if ((res = mp_grow(b, USED(a))) != MP_OKAY) {
      return res;
   }

   for (i = 0; i < USED(a); i++) {
      DIGIT(b,i) = ~(DIGIT(a,i)) & MP_MASK;
   }
   USED(b) = USED(a);
   /*
    * No mp_clamp(b) to keep the size of b constant.
    * This might not be what the user wants, document!
    */
   return res;
}

And just mask off what you need. Or did I misunderstand something?

minad commented 5 years ago

@czurnieden Are you sure? I didn't think much about it but tested it against GMP mpz_com.

czurnieden commented 5 years ago

@minad

Are you sure?

No, I am not. I wouldn't have asked, if I was sure ;-)

but tested it against GMP mpz_com.

Aaaah, therein lies the problem. If I may quote from gmp-6.1.2/mpz/com.c:

          /* Store a negative size, to indicate ones-extension.  */
          SIZ (dst) = -size;

They set an extra flag (or abuse the "size" variable if you want ) to indicate one's complement. Do you want to do the same? It is quite a lot of work, I can tell you.

minad commented 5 years ago

@czurnieden In GMP they are using the size field to store the sign. A negative size field for negative numbers. But the representation is still sign-magnitude.

czurnieden commented 5 years ago

@minad

In GMP they are using the size field to store the sign.

Yes, but that's not abuse it's just economical (you need only one bit for the sign, why waste more?). They really use sign as an indicator!

But you don't have to believe me, of course, we shall test it.

#include <stdlib.h>
#include <stdio.h>
#include <gmp.h>
#include "/home/czurnieden/GITHUB/libtommath/tommath.h"

/* Shamelessly stolen from
  https://stackoverflow.com/questions/111928/is-there-a-printf-converter-to-print-in-binary-format */
static void printBits(size_t const size, void const *const ptr)
{
   unsigned char *b = (unsigned char *) ptr;
   unsigned char byte;
   int i;
   int j;

   for (i = (int) size - 1; i >= 0; i--) {
      for (j = 7; j >= 0; j--) {
         byte = (b[i] >> j) & 1;
         printf("%u", byte);
      }
   }
   puts("");
}
int main(void) {
   mpz_t x, y, t1, t2, t3, r1 , r2;
   mp_int lx, ly, lt1, lt2, lt3, lr1 , lr2;
   unsigned int ux, uy, ut1, ut2, ut3, ur1, ur2;
   int ix, iy;

   mpz_init(x);
   mpz_init(y);
   mpz_init(t1);
   mpz_init(t2);
   mpz_init(t3);
   mpz_init(r1);
   mpz_init(r2);

   mp_init_multi(&lx, &ly, &lt1, &lt2, &lt3, &lr1 , &lr2 , NULL);

   /*
   for (int i=257;i>0;i--) {
      mpz_set_si(x, -i);
      printf("%d   y =%9.9s\n",-i,mpz_get_str (NULL, 2, x));
      mpz_com(x,x);
      printf("%d  ~y =%9.9s\n",-i,mpz_get_str (NULL, 2, x));
      mpz_com(x,x);
      printf("%d ~~y =%9.9s\n",-i,mpz_get_str (NULL, 2, x));
      puts("");
   }
   puts("");
   for (int i=0;i<257;i++) {
      mpz_set_ui(x, i);
      printf("%d   x =%9.9s\n",i,mpz_get_str (NULL, 2, x));
      mpz_com(x,x);
      printf("%d  ~x =%9.9s\n",i,mpz_get_str (NULL, 2, x));
      mpz_com(x,x);
      printf("%d ~~x =%9.9s\n",i,mpz_get_str (NULL, 2, x));
      puts("");
   }
   */

   /*
      Randomly choosen test from gmp-6.1.2/tests/mpz/logic.c

      mpz_com (t1, x);
      MPZ_CHECK_FORMAT (t1);
      mpz_com (t2, y);
      MPZ_CHECK_FORMAT (t2);
      mpz_and (t3, t1, t2);
      MPZ_CHECK_FORMAT (t3);
      mpz_com (r1, t3);
      MPZ_CHECK_FORMAT (r1);
      mpz_ior (r2, x, y);
      MPZ_CHECK_FORMAT (r2);
      if (mpz_cmp (r1, r2) != 0)
        dump_abort ();

      So the last two results must be equal: ~(~x & ~y) == x | y
   */

   puts("GMP's method");
   mpz_set_ui(x, 5);
   mpz_set_ui(y, 6);
   puts("x = 5, y = 6");
   printf("%s   = x          = %5.5s\n",mpz_get_str (NULL, 10, x) ,mpz_get_str (NULL, 2, x));
   printf("%s   = y          = %5.5s\n",mpz_get_str (NULL, 10, y) ,mpz_get_str (NULL, 2, y));
   mpz_com (t1, x);
   printf("%s  = ~x         = %5.5s\n",mpz_get_str (NULL, 10, t1) ,mpz_get_str (NULL, 2, t1));
   mpz_com (t2, y);
   printf("%s  = ~y         = %5.5s\n",mpz_get_str (NULL, 10, t2) ,mpz_get_str (NULL, 2, t2));
   mpz_and (t3, t1, t2);
   printf("%s  = ~x & ~y    = %5.5s\n",mpz_get_str (NULL, 10, t3) ,mpz_get_str (NULL, 2, t3));
   mpz_com (r1, t3);
   printf("%s   = ~(~x & ~y) = %5.5s\n",mpz_get_str (NULL, 10, r1) ,mpz_get_str (NULL, 2, r1));
   mpz_ior (r2, x, y);
   printf("%s   = x | y      = %5.5s\n",mpz_get_str (NULL, 10, r2) ,mpz_get_str (NULL, 2, r2));

   puts("\nWith native two's complement");
   ux = 5;
   uy = 6;
   printf("%u   = ux                   = ", ux); printBits(sizeof(int),&ux);
   printf("%u   = uy                   = ", uy); printBits(sizeof(int),&uy);
   ix = -5;
   iy = -6;
   printf("%d   = ix                  = ", ix); printBits(sizeof(int),&ix);
   printf("%d   = iy                  = ", iy); printBits(sizeof(int),&iy);
   ut1 = ~ux;
   printf("\n%u  = ~ux          = ", ut1); printBits(sizeof(int),&ut1);
   ut2 = ~uy;
   printf("%u  = ~uy          = ", ut2); printBits(sizeof(int),&ut2);
   ut3 = ut1 & ut2;
   printf("%u  = ~ux & ~uy    = ", ut3); printBits(sizeof(int),&ut3);
   ur1 = ~ut3;
   printf("%u           = ~(~ux & ~uy) = ", ur1); printBits(sizeof(int),&ur1);
   ur2 = ux | uy;
   printf("%u           = ux | uy      = ", ur1); printBits(sizeof(int),&ur1);

   puts("\nLTM's method");
   mp_set(&lx, 5);
   mp_set(&ly, 6);
   printf("5   = lx           =  "); mp_fwrite (&lx, 2, stdout);
   printf("\n5   = ly           =  ");mp_fwrite (&ly, 2, stdout);
   mp_complement (&lx, &lt1);
   printf("\n      ~lx          = ");mp_fwrite (&lt1, 2, stdout);
   mp_complement (&ly, &lt2);
   printf("\n      ~ly          = ");mp_fwrite (&lt2, 2, stdout);
   mp_and (&lt1, &lt2, &lt3);
   printf("\n      ~lx & ~ly    = ");mp_fwrite (&lt3, 2, stdout);
   mp_complement (&lt3, &lr1);
   printf("\n      ~(~lx & ~ly) =  ");mp_fwrite (&lr1, 2, stdout);
   mp_or (&lx, &ly, &lr2);
   printf("\n      lx | ly      =  ");mp_fwrite (&lr2, 2, stdout);
   putchar('\n');

   /* this is a bit unnecessary */
   mp_clear_multi(&lx, &ly, &lt1, &lt2, &lt3, &lr1 , &lr2 , NULL);

   mpz_clear(x);
   mpz_clear(y);
   mpz_clear(t1);
   mpz_clear(t2);
   mpz_clear(t3);
   mpz_clear(r1);
   mpz_clear(r2);

   exit(EXIT_SUCCESS);
}

That prints:

5   = x          =   101
6   = y          =   110
-6  = ~x         =  -110
-7  = ~y         =  -111
-8  = ~x & ~y    = -1000
7   = ~(~x & ~y) =   111
7   = x | y      =   111

With native two's complement
5   = ux                   = 00000000000000000000000000000101
6   = uy                   = 00000000000000000000000000000110
-5   = ix                  = 11111111111111111111111111111011
-6   = iy                  = 11111111111111111111111111111010

4294967290  = ~ux          = 11111111111111111111111111111010
4294967289  = ~uy          = 11111111111111111111111111111001
4294967288  = ~ux & ~uy    = 11111111111111111111111111111000
7           = ~(~ux & ~uy) = 00000000000000000000000000000111
7           = ux | uy      = 00000000000000000000000000000111

LTM's (your) method
5   = lx           =  101
5   = ly           =  110
      ~lx          = -110
      ~ly          = -111
      ~lx & ~ly    = -110
      ~(~lx & ~ly) =  101
      lx | ly      =  111

The last two results with your implementation of mpz_com are not equal but they must be equal because ~(~x & ~y) = x | y according to De Morgan's laws.

minad commented 5 years ago

You have to use tc_and etc. Maybe rename complement to tc_complement to make things more clear?