jruizgit / rules

Durable Rules Engine
MIT License

Integers appear to be compared with 32 bit precision #268

Open mattsha opened 5 years ago

mattsha commented 5 years ago

From Python, I was able to assert a fact containing integer values that need more than 32 bits of precision, but the values appear to be truncated inside the engine during comparison. Using floating-point values works around this to a certain extent.

Python's json dumps/loads preserves precision, so I'd expect the rules engine not to truncate silently; it should either keep the full value or raise a warning or error.

from durable.lang import *
from durable.engine import MessageNotHandledException
import json

with ruleset('int'):
  @when_all(c.a << +m.val,
            c.b << m.val > c.a.val)
  def greater_than(c):
    print(f'{c.b.val} > {c.a.val}')

  @when_all(c.a << +m.val,
            c.b << m.val == c.a.val)
  def equals(c):
    print(f'{c.a.val} == {c.b.val}')

assert_fact('int', {'val': 1569944950004})
assert_fact('int', {'val': 1569944950005})

with ruleset('float'):
  @when_all(c.a << +m.val,
            c.b << m.val > c.a.val)
  def print_msg(c):
    print(f'{c.b.val} > {c.a.val}')

  @when_all(c.a << +m.val,
            c.b << m.val == c.a.val)
  def equals(c):
    print(f'{c.a.val} == {c.b.val}')

assert_fact('float', {'val': 1569944950004.0})
assert_fact('float', {'val': 1569944950005.0})

print('test json encode/decode')
print(json.loads(json.dumps({'val': 1569944950004})))
print(json.loads(json.dumps({'val': 1569944950005})))

Output:

1569944950005 == 1569944950004
1569944950004 == 1569944950005
1569944950005.0 > 1569944950004.0
test json encode/decode
{'val': 1569944950004}
{'val': 1569944950005}
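A minimal sketch of why the float workaround only helps "to a certain extent", assuming Python floats are IEEE 754 doubles (the case for CPython on common platforms). Integers are represented exactly only up to 2**53; the name `EXACT_LIMIT` is just for illustration.

```python
# Doubles have a 53-bit significand, so every integer with magnitude up to
# 2**53 has an exact float representation. Above that, neighbours can collide.
EXACT_LIMIT = 2 ** 53

a, b = 1569944950004, 1569944950005
assert b < EXACT_LIMIT
print(float(b) > float(a))            # the values from the report stay distinct

big = EXACT_LIMIT
print(float(big) == float(big + 1))   # adjacent integers above the limit compare equal
```

So the timestamps in the report are safe as floats, but larger 64-bit integers would not be.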

jruizgit commented 5 years ago

Thanks for reporting this issue. I will be working on a fix within the next couple of days.

jruizgit commented 5 years ago

This actually works for me. I'm thinking the behavior might depend on the platform. What platform are you using?

mshawver commented 5 years ago

Windows 10 amd64. I ran pip install at a VS 2019 x64 command prompt and also verified that rules.cp37-win_amd64.dll is an x64 DLL.

I'm guessing you are using a "long" type, which is 4 bytes on Windows? Maybe "long long" would have more consistent behavior:

https://software.intel.com/en-us/articles/size-of-long-integer-type-on-different-architecture-and-os
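The size difference can be checked from Python itself. This is a small sketch using the standard `ctypes` module; it only reports the C type sizes for the running interpreter and says nothing about how the engine parses numbers internally.

```python
import ctypes

# Size in bytes of the C integer types as seen by this Python build.
# c_long is typically 4 bytes on 64-bit Windows and 8 bytes on 64-bit Linux/macOS.
long_size = ctypes.sizeof(ctypes.c_long)
# c_longlong is 8 bytes on all mainstream platforms.
long_long_size = ctypes.sizeof(ctypes.c_longlong)

print(f"long: {long_size} bytes, long long: {long_long_size} bytes")
```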

You may also want to consider what the behavior should be for Python ints that need more than 8 bytes, since a Python int can have arbitrary precision. I think an error might be reasonable in this case if the check is not too expensive. For example (on Linux amd64 this time):

from durable.lang import *

with ruleset('int'):
  @when_all(c.a << +m.val,
            c.b << m.val > c.a.val)
  def greater_than(c):
    print(f'{c.b.val} > {c.a.val}')

  @when_all(c.b << +m.val,
            c.a << m.val == c.b.val)
  def equals(c):
    print(f'{c.a.val} == {c.b.val}')

x = 9223372036854775807
y = 9223372036854775808
print('durable-rules comparison:')
assert_fact('int', {'val': x})
assert_fact('int', {'val': y})

print('python comparison:')
if y>x:
  print(f'{y} > {x}')
elif y==x:
  print(f'{y} == {x}')

Output:

durable-rules comparison:
9223372036854775807 == 9223372036854775808
9223372036854775808 == 9223372036854775807
python comparison:
9223372036854775808 > 9223372036854775807
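Until the engine reports such values itself, a caller-side guard is one option. The sketch below is a hypothetical helper (`check_int64` is not part of durable_rules) and assumes facts are flat dicts; it raises before an out-of-range integer reaches the engine.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1

def check_int64(fact):
    """Raise OverflowError if any int in a flat dict is outside the signed 64-bit range."""
    for key, value in fact.items():
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, int) and not isinstance(value, bool):
            if not INT64_MIN <= value <= INT64_MAX:
                raise OverflowError(
                    f"{key}={value} does not fit in a signed 64-bit integer")
    return fact
```

With this in place, `assert_fact('int', check_int64({'val': y}))` would fail loudly for `y = 9223372036854775808` instead of comparing equal to `x`.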

jruizgit commented 5 years ago

Hi, thanks for providing the info. Yes, the numeric values were being converted to long. I have published a fix for it (using long long instead). Please use version 2.0.10 (published on PyPI).

psychemedia commented 4 years ago

I've had a similar issue if I pass in a dict that has a numpy.int64 item along with items of another type.

For example, with the following pandas dataframe:

import pandas as pd
import numpy as np
df=pd.DataFrame({'intval':[1], 'strval':['a']})
df['intval'] = df['intval'].astype(np.int64)
print(df.dtypes)

the following works if I pass in a dict that just contains the int64 item:

from durable.lang import *
with ruleset('_npint_test'):

    @when_all(m.intval >0)
    def testint(c):
        print('works')

#df[['intval']].iloc[0].to_dict() -> {'intval': 1}

post('_npint_test', df[['intval']].iloc[0].to_dict())

---------------------------------------------------------------------------
works

but if I pass in a dict with another object in the dict, it fails:


#df.iloc[0].to_dict() -> {'intval': 1, 'strval': 'a'}
post('_npint_test', df.iloc[0].to_dict())

TypeError: Object of type int64 is not JSON serializable 
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-53-ece0fa780699> in <module>
----> 1 post('_npint_test', df.iloc[0].to_dict())

/usr/local/lib/python3.7/site-packages/durable/lang.py in post(ruleset_name, message, complete)
    664 
    665 def post(ruleset_name, message, complete = None):
--> 666     return get_host().post(ruleset_name, message, complete)
    667 
    668 def post_batch(ruleset_name, messages, complete = None):

/usr/local/lib/python3.7/site-packages/durable/engine.py in post(self, ruleset_name, message, complete)
    784 
    785         rules = self.get_ruleset(ruleset_name)
--> 786         return self._handle_function(rules, rules.assert_event, message, complete)
    787 
    788     def post_batch(self, ruleset_name, messages, complete = None):

/usr/local/lib/python3.7/site-packages/durable/engine.py in _handle_function(self, rules, func, args, complete)
    768 
    769         if not complete:
--> 770             rules.do_actions(func(args), callback)
    771             if error[0]:
    772                 raise error[0]

/usr/local/lib/python3.7/site-packages/durable/engine.py in assert_event(self, message)
    327 
    328     def assert_event(self, message):
--> 329         return self._handle_result(durable_rules_engine.assert_event(self._handle, json.dumps(message, ensure_ascii=False)), message)
    330 
    331     def assert_events(self, messages):

/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    236         check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237         separators=separators, default=default, sort_keys=sort_keys,
--> 238         **kw).encode(obj)
    239 
    240 

/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py in encode(self, o)
    197         # exceptions aren't as detailed.  The list call should be roughly
    198         # equivalent to the PySequence_Fast that ''.join() would do.
--> 199         chunks = self.iterencode(o, _one_shot=True)
    200         if not isinstance(chunks, (list, tuple)):
    201             chunks = list(chunks)

/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py in iterencode(self, o, _one_shot)
    255                 self.key_separator, self.item_separator, self.sort_keys,
    256                 self.skipkeys, _one_shot)
--> 257         return _iterencode(o, 0)
    258 
    259 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py in default(self, o)
    177 
    178         """
--> 179         raise TypeError(f'Object of type {o.__class__.__name__} '
    180                         f'is not JSON serializable')
    181 

TypeError: Object of type int64 is not JSON serializable

Following this through, something weird is going on, because it seems that json.dumps() can't cope with np.int64 at all?

json.dumps( {'int64': np.int64(1)})

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-67-e2b056f3a557> in <module>
----> 1 json.dumps( {'int64': np.int64(1)})

/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    229         cls is None and indent is None and separators is None and
    230         default is None and not sort_keys and not kw):
--> 231         return _default_encoder.encode(obj)
    232     if cls is None:
    233         cls = JSONEncoder

/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py in encode(self, o)
    197         # exceptions aren't as detailed.  The list call should be roughly
    198         # equivalent to the PySequence_Fast that ''.join() would do.
--> 199         chunks = self.iterencode(o, _one_shot=True)
    200         if not isinstance(chunks, (list, tuple)):
    201             chunks = list(chunks)

/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py in iterencode(self, o, _one_shot)
    255                 self.key_separator, self.item_separator, self.sort_keys,
    256                 self.skipkeys, _one_shot)
--> 257         return _iterencode(o, 0)
    258 
    259 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py in default(self, o)
    177 
    178         """
--> 179         raise TypeError(f'Object of type {o.__class__.__name__} '
    180                         f'is not JSON serializable')
    181 

TypeError: Object of type int64 is not JSON serializable

Related numpy issue: https://github.com/numpy/numpy/issues/12481
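Since the traceback shows the message being passed straight to `json.dumps`, one possible workaround is to convert NumPy scalars to plain Python values before calling `post`. The helper below is a hypothetical sketch (`to_native` is not part of durable_rules or pandas); it assumes a flat dict of scalar values and relies on NumPy scalars exposing an `.item()` method that returns the equivalent Python scalar.

```python
def to_native(message):
    """Return a copy of a flat dict with NumPy-style scalars converted to Python types."""
    native = {}
    for key, value in message.items():
        # NumPy scalars expose .item(); plain Python ints, floats and
        # strings have no such method and are passed through unchanged.
        item = getattr(value, "item", None)
        native[key] = item() if callable(item) else value
    return native
```

Usage would then look like `post('_npint_test', to_native(df.iloc[0].to_dict()))`. This has not been run against the environment in this thread, so treat it as a starting point rather than a verified fix.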