adriank / ObjectPath

The agile query language for semi-structured data
http://objectpath.org
MIT License
380 stars 93 forks source link

type check should use `isinstance` instead of `type() is` #91

Closed truebit closed 4 years ago

truebit commented 4 years ago

I saw all the type check in this libary using type(xxx) is/is not. I suggest using another built-in function: isinstance.

The main difference here is that isinstance also returns True for subclasses. More details could be found here: What are the differences between type() and isinstance()?

I got a subclass derived from dict, objectpath would raise an error. change interpreter.py line 44 using isinstance would fix the bug.

  def setData(self, obj):
    if isinstance(obj, tuple(ITER_TYPES + [dict])):
      self.data = obj
adriank commented 4 years ago

Hi,

Last time I checked, isinstance was making ObjectPath significantly slower. If you do performance tests and the difference isn't so drastic nowadays, I'll accept your PR.

Best, Adrian

truebit commented 4 years ago

Oh, I did not realize the performance issue. If so, I think we should narrow down the issue to setData method itself. If we agree to support subclasses in objectpath, we could change this method using isinstance.

truebit commented 4 years ago

concerning about the performance, here below is my testing results:

environment: python2 version: 2.7.16 python3 version : 3.7.4 both installed via brew Mac OS X version: 10.14.6 (18G103)

result: all in 10**6 loop of 3 times with time unit microseconds (usec)

# python2 with type/isinstance diff
python2 -m timeit -n 1000000 'type("A") in (list,basestring)'
1000000 loops, best of 3: 0.143 usec per loop

python2 -m timeit -n 1000000 'isinstance("A", (list,str))'
1000000 loops, best of 3: 0.234 usec per loop

# python3 with type/isinstance diff
python3 -m timeit -r 3 -n 1000000 -u usec  'type("A") in (list,str)'
1000000 loops, best of 3: 0.179 usec per loop

python3 -m timeit -r 3 -n 1000000 -u usec  'isinstance("A", (list,str))'
1000000 loops, best of 3: 0.188 usec per loop

# python2/3 multiple type check. isinstance only support tuple
python3 -m timeit -r 3 -n 1000000 -u usec  'type("A") in [list,str]'
1000000 loops, best of 3: 0.208 usec per loop

python2 -m timeit -n 1000000 'type("A") in [list,str]'
1000000 loops, best of 3: 0.212 usec per loop

# python2/3 multiple type check using + in list
python2 -m timeit -n 1000000 'type("A") in [list,str]+[int]'
1000000 loops, best of 3: 0.52 usec per loop
python3 -m timeit -r 3 -n 1000000 -u usec  'type("A") in [list,str]+[int]'
1000000 loops, best of 3: 0.34 usec per loop

If microseconds level performance is a problem, I think we should also consider using tuple instead of list or even plus to do type check

adriank commented 4 years ago

Ok, I merged PR. We'll see.

Greetings, Adrian Kalbarczyk

https://kalbarczyk.co | https://devyard.io

On Thu, Oct 17, 2019 at 8:55 AM Xiao Wang notifications@github.com wrote:

concerning about the performance, here below is my testing results:

environment: python2 version: 2.7.16 python3 version : 3.7.4 both installed via brew Mac OS X version: 10.14.6 (18G103)

result: all in 10**6 loop with time unit microseconds (usec)

python2 with type/isinstance diff

python2 -m timeit -n 1000000 'type("A") in (list,basestring)' 1000000 loops, best of 3: 0.143 usec per loop

python2 -m timeit -n 1000000 'isinstance("A", (list,str))' 1000000 loops, best of 3: 0.234 usec per loop

python3 with type/isinstance diff

python3 -m timeit -n 1000000 -u usec 'type("A") in (list,str)' 1000000 loops, best of 5: 0.179 usec per loop

python3 -m timeit -n 1000000 -u usec 'isinstance("A", (list,str))' 1000000 loops, best of 5: 0.188 usec per loop

python2/3 mutliple type check. isinstance only support tuple

python3 -m timeit -n 1000000 -u usec 'type("A") in [list,str]' 1000000 loops, best of 5: 0.208 usec per loop

python2 -m timeit -n 1000000 'type("A") in [list,str]' 1000000 loops, best of 3: 0.212 usec per loop

If microseconds performance diff is a problem, I think we should also consider using tuple instead of list to do type check

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/adriank/ObjectPath/issues/91?email_source=notifications&email_token=AABLE4VOCHOFHROT7ZL2LXTQPAD7LA5CNFSM4JBG3QM2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBPAW4I#issuecomment-543034225, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABLE4TIJNJTBT47IUYKY6DQPAD7LANCNFSM4JBG3QMQ .

adriank commented 4 years ago

Ok, but you did not run tests before submitting PR. Tests fail.

Greetings, Adrian Kalbarczyk

https://kalbarczyk.co | https://devyard.io

On Thu, Oct 17, 2019 at 10:08 PM Adrian Kalbarczyk < adrian.kalbarczyk@gmail.com> wrote:

Ok, I merged PR. We'll see.

Greetings, Adrian Kalbarczyk

https://kalbarczyk.co | https://devyard.io

On Thu, Oct 17, 2019 at 8:55 AM Xiao Wang notifications@github.com wrote:

concerning about the performance, here below is my testing results:

environment: python2 version: 2.7.16 python3 version : 3.7.4 both installed via brew Mac OS X version: 10.14.6 (18G103)

result: all in 10**6 loop with time unit microseconds (usec)

python2 with type/isinstance diff

python2 -m timeit -n 1000000 'type("A") in (list,basestring)' 1000000 loops, best of 3: 0.143 usec per loop

python2 -m timeit -n 1000000 'isinstance("A", (list,str))' 1000000 loops, best of 3: 0.234 usec per loop

python3 with type/isinstance diff

python3 -m timeit -n 1000000 -u usec 'type("A") in (list,str)' 1000000 loops, best of 5: 0.179 usec per loop

python3 -m timeit -n 1000000 -u usec 'isinstance("A", (list,str))' 1000000 loops, best of 5: 0.188 usec per loop

python2/3 mutliple type check. isinstance only support tuple

python3 -m timeit -n 1000000 -u usec 'type("A") in [list,str]' 1000000 loops, best of 5: 0.208 usec per loop

python2 -m timeit -n 1000000 'type("A") in [list,str]' 1000000 loops, best of 3: 0.212 usec per loop

If microseconds performance diff is a problem, I think we should also consider using tuple instead of list to do type check

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/adriank/ObjectPath/issues/91?email_source=notifications&email_token=AABLE4VOCHOFHROT7ZL2LXTQPAD7LA5CNFSM4JBG3QM2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBPAW4I#issuecomment-543034225, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABLE4TIJNJTBT47IUYKY6DQPAD7LANCNFSM4JBG3QMQ .

truebit commented 4 years ago

I tried on my local machine, all tests passed. Could you please list which tests failed?

adriank commented 4 years ago

https://travis-ci.org/adriank/ObjectPath/builds/599331117?utm_medium=notification&utm_source=email

On Fri, 18 Oct 2019 at 04:32 Xiao Wang notifications@github.com wrote:

I tried on my local machine, all tests passed. Could you please list which tests failed?

— You are receiving this because you modified the open/close state.

Reply to this email directly, view it on GitHub https://github.com/adriank/ObjectPath/issues/91?email_source=notifications&email_token=AABLE4WAYTLDGOX7HBWROULQPEN5HA5CNFSM4JBG3QM2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBSIIQQ#issuecomment-543458370, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABLE4QZXJDGS7GW44F3F7TQPEN5HANCNFSM4JBG3QMQ .

-- Greetings, Adrian Kalbarczyk

https://kalbarczyk.co | https://devyard.io

truebit commented 4 years ago

I saw it’s Travis-ci environment problem:

Downloading archive: https://storage.googleapis.com/travis-ci-language-archives/python/binaries/ubuntu/16.04/x86_64/python-3.3.tar.bz2 1620.12s$ curl -sSf -o python-3.3.tar.bz2 ${archive_url} 163curl: (22) The requested URL returned error: 404 Not Found 164Unable to download 3.3 archive. The archive may not exist. Please consider a different version.

Adrian Kalbarczyk notifications@github.com于2019年10月18日 周五下午5:25写道:

https://travis-ci.org/adriank/ObjectPath/builds/599331117?utm_medium=notification&utm_source=email

On Fri, 18 Oct 2019 at 04:32 Xiao Wang notifications@github.com wrote:

I tried on my local machine, all tests passed. Could you please list which tests failed?

— You are receiving this because you modified the open/close state.

Reply to this email directly, view it on GitHub < https://github.com/adriank/ObjectPath/issues/91?email_source=notifications&email_token=AABLE4WAYTLDGOX7HBWROULQPEN5HA5CNFSM4JBG3QM2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBSIIQQ#issuecomment-543458370 , or unsubscribe < https://github.com/notifications/unsubscribe-auth/AABLE4QZXJDGS7GW44F3F7TQPEN5HANCNFSM4JBG3QMQ

.

-- Greetings, Adrian Kalbarczyk

https://kalbarczyk.co | https://devyard.io

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/adriank/ObjectPath/issues/91?email_source=notifications&email_token=AAFYJWRJPUHL44WDGFKYCKDQPF6G7A5CNFSM4JBG3QM2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBTRNPA#issuecomment-543626940, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFYJWVMAVAG542K6IXJSSTQPF6G7ANCNFSM4JBG3QMQ .

-- Br,

Sean Wang Blog: fclef.wordpress.com http://fclef.wordpress.com/about

adriank commented 4 years ago

TypeError: isinstance() arg 2 must be a class, type, or tuple of classes and types 251 252

I have the very same error on my Mac, using Python 2.7.

Greetings, Adrian Kalbarczyk

https://kalbarczyk.co | https://devyard.io

On Fri, Oct 18, 2019 at 12:08 PM Xiao Wang notifications@github.com wrote:

I saw it’s Travis-ci environment problem:

Downloading archive:

https://storage.googleapis.com/travis-ci-language-archives/python/binaries/ubuntu/16.04/x86_64/python-3.3.tar.bz2 1620.12s$ curl -sSf -o python-3.3.tar.bz2 ${archive_url} 163curl: (22) The requested URL returned error: 404 Not Found 164Unable to download 3.3 archive. The archive may not exist. Please consider a different version.

Adrian Kalbarczyk notifications@github.com于2019年10月18日 周五下午5:25写道:

https://travis-ci.org/adriank/ObjectPath/builds/599331117?utm_medium=notification&utm_source=email

On Fri, 18 Oct 2019 at 04:32 Xiao Wang notifications@github.com wrote:

I tried on my local machine, all tests passed. Could you please list which tests failed?

— You are receiving this because you modified the open/close state.

Reply to this email directly, view it on GitHub <

https://github.com/adriank/ObjectPath/issues/91?email_source=notifications&email_token=AABLE4WAYTLDGOX7HBWROULQPEN5HA5CNFSM4JBG3QM2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBSIIQQ#issuecomment-543458370

, or unsubscribe <

https://github.com/notifications/unsubscribe-auth/AABLE4QZXJDGS7GW44F3F7TQPEN5HANCNFSM4JBG3QMQ

.

-- Greetings, Adrian Kalbarczyk

https://kalbarczyk.co | https://devyard.io

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub < https://github.com/adriank/ObjectPath/issues/91?email_source=notifications&email_token=AAFYJWRJPUHL44WDGFKYCKDQPF6G7A5CNFSM4JBG3QM2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBTRNPA#issuecomment-543626940 , or unsubscribe < https://github.com/notifications/unsubscribe-auth/AAFYJWVMAVAG542K6IXJSSTQPF6G7ANCNFSM4JBG3QMQ

.

-- Br,

Sean Wang Blog: fclef.wordpress.com http://fclef.wordpress.com/about

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/adriank/ObjectPath/issues/91?email_source=notifications&email_token=AABLE4V7MXEOTHGP36QSO5DQPGDKVA5CNFSM4JBG3QM2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBTXD4A#issuecomment-543650288, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABLE4RG7NSAMZCKBZGQZFLQPGDKVANCNFSM4JBG3QMQ .

truebit commented 4 years ago

I saw it throws TypeError: isinstance() arg 2 must be a class, type, or tuple of classes and types on lineif isinstance(obj, tuple(ITER_TYPES + [dict])).

It's due to the python 2 handling process on map and filter. Actually something like type(map(string.lower, 'ABC')) is map is alsoFalse in python2, but covered by the in list check.

The NameError would only raise when it's python 1 for map and filter. What's your opinion to fix this problem? Using Python version check maybe a good solution.

truebit commented 4 years ago

I created a PR #93 to fix this failure.