josemiotto / pylevy

Levy distributions for Python
GNU General Public License v3.0

levy on Python 3 #1

Closed lukast008 closed 7 years ago

lukast008 commented 8 years ago

Hello, thanks for this lib :)

I'm trying to run this code using Python 3.4, but I'm getting errors with base64 encoding. Could you check this:

    Traceback (most recent call last):
      File "C:\Python34\lib\base64.py", line 519, in _input_type_check
        m = memoryview(s)
    TypeError: memoryview: str object does not have the buffer interface

The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "<pyshell#23>", line 1, in <module>
        levy.fit_levy(x)
      File "C:\Python34\lib\site-packages\levy.py", line 369, in fit_levy
        alpha, beta, mu, sigma = parameters.get_all()
      File "C:\Python34\lib\site-packages\scipy\optimize\optimize.py", line 377, in fmin
        res = _minimize_neldermead(func, x0, args, callback=callback, **opts)
      File "C:\Python34\lib\site-packages\scipy\optimize\optimize.py", line 435, in _minimize_neldermead
        fsim[0] = func(x0)
      File "C:\Python34\lib\site-packages\scipy\optimize\optimize.py", line 285, in function_wrapper
        return function(*(wrapper_args + args))
      File "C:\Python34\lib\site-packages\levy.py", line 367, in neglog_density
        return np.sum(neglog_levy(x, alpha, beta, mu, sigma))
      File "C:\Python34\lib\site-packages\levy.py", line 315, in neglog_levy
        return -np.log(np.maximum(1e-100, levy(x, alpha, beta, mu, sigma, par=par)))
      File "C:\Python34\lib\site-packages\levy.py", line 270, in levy
        import levy_data
      File "C:\Python34\lib\site-packages\levy_data.py", line 71588, in <module>
        """))
      File "C:\Python34\lib\base64.py", line 561, in decodestring
        return decodebytes(s)
      File "C:\Python34\lib\base64.py", line 553, in decodebytes
        _input_type_check(s)
      File "C:\Python34\lib\base64.py", line 522, in _input_type_check
        raise TypeError(msg) from err
    TypeError: expected bytes-like object, not str

josemiotto commented 8 years ago

Hi, thanks for reporting this error. I did everything for Python 2, so I have to take a deeper look at it and prepare a version for Python 3. In any case, from a quick look I see that in Python 3 the argument of decode is a bytes object, not a string, so I think that in the _make_data_file function you have to replace decodestring with b64decode and encodestring with b64encode. In particular, replace

    file = open("levy_data.py", "wt")
    file.write("""
    # This is a generated file, do not edit.
    import numpy, base64

    pdf = numpy.loads(base64.decodestring(
    \"\"\"%s\"\"\"))\n
    cdf = numpy.loads(base64.decodestring(
    \"\"\"%s\"\"\"))\n""" %
        (base64.encodestring(pdf.dumps()), base64.encodestring(cdf.dumps())))

with

    file = open("levy_data.py", "wt")
    file.write("""
    # This is a generated file, do not edit.
    import numpy, base64

    pdf = numpy.loads(base64.b64decode(
    \"\"\"%s\"\"\"))\n
    cdf = numpy.loads(base64.b64decode(
    \"\"\"%s\"\"\"))\n""" %
        (base64.b64encode(pdf.dumps()), base64.b64encode(cdf.dumps())))

You have to make the same changes in _make_approx_data_file. Then run 'levy.py build'; it will take a while (some minutes).

I don't have time to check it now, but if you can run it, please let me know if it works.

best, José
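
For readers following along, here is a minimal sketch of the str/bytes distinction discussed above. It is not pylevy's actual code: the nested list `payload` is a hypothetical stand-in for the real tables, and all variable names are assumptions. On Python 3, base64.b64encode returns bytes, so formatting the result with %s embeds its repr (b'...') rather than the base64 text. This is one thing that swapping the function names alone does not address; decoding to ASCII before embedding is one possible way around it.

```python
import base64
import pickle

# Hypothetical stand-in for the real tables, which are not reproduced here.
payload = [[0.0, 0.5], [1.0, 1.5]]
raw = pickle.dumps(payload)          # bytes
encoded = base64.b64encode(raw)      # bytes on Python 3, not str

# decodebytes (the function decodestring forwarded to) rejects str on
# Python 3, which is the TypeError shown in the traceback.
try:
    base64.decodebytes(encoded.decode("ascii"))
    rejected_str = False
except TypeError:
    rejected_str = True

# "%s" % bytes embeds the repr (b'...'), not the base64 text itself.
embedded_repr = "%s" % encoded

# Decoding to ASCII first gives text that is safe to write into a
# generated source file and to decode again later.
embedded_text = "%s" % encoded.decode("ascii")
restored = pickle.loads(base64.b64decode(embedded_text))
```

Here `restored` equals the original `payload`, while `embedded_repr` carries the extra b'...' wrapper that would end up inside the generated file.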


lukast008 commented 8 years ago

Thanks José, I changed it as you suggested, but unfortunately I still have the same issue :/

Best regards, Lukasz

josemiotto commented 8 years ago

On Tuesday I will take a look at it, thanks!

best José


aavanian commented 7 years ago

I had to do a few more things to make it work under Python 3.5. I'm not very familiar with writing Python 2 backward-compatible code, especially regarding encodings, so no PR at this stage. That being said, @josemiotto, is there any reason why you dump the limit, pdf and cdf data this way instead of using a pickle, a json file, a shelve or numpy.savez? That would bypass this issue in a transparent way, allow built-in compression in the case of savez, etc.

josemiotto commented 7 years ago

Hi, thanks for the interest, I should definitely take a look at these issues.

Honestly, the dump of the data is something inherited from the previous version of the code. It is indeed somewhat odd, but it is very fast; in fact, pickle is the slowest method. json is very good, but the data is very structured (it is a matrix after all, covering all the alpha, beta combinations), and this approach works better.

Only recently I began working with Python 3, and I see that it is superior in handling the different encodings. I will check shelve and numpy.savez and come back with a solution.

best regards, and thanks Jose


aavanian commented 7 years ago

Actually, the code already uses pickle (np.ndarray.dump is a call to pickle.dump). The advantage of shelve over pickle is that it can store several objects (but you could also pickle a dict of the arrays). It uses pickle internally.

numpy.savez is different (I don't think it calls pickle) but works in a similar way to shelve (you can store different objects in the same file).
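
To make the two options concrete, here is a small sketch rather than the library's implementation. The arrays `pdf` and `cdf` are placeholders with made-up shapes, and an in-memory buffer stands in for a real file path; both are assumptions for illustration only.

```python
import io
import pickle

import numpy as np

# Placeholder tables; names and shapes are illustrative only.
pdf = np.linspace(0.0, 1.0, 6).reshape(2, 3)
cdf = np.cumsum(pdf, axis=1)

# Option 1: a single pickle holding a dict of the arrays.
blob = pickle.dumps({"pdf": pdf, "cdf": cdf})
from_pickle = pickle.loads(blob)

# Option 2: a single .npz archive holding the arrays under keyword names.
buffer = io.BytesIO()                  # in-memory stand-in for a file on disk
np.savez_compressed(buffer, pdf=pdf, cdf=cdf)
buffer.seek(0)                         # rewind before reading back
with np.load(buffer) as archive:
    from_npz = {name: archive[name] for name in archive.files}
```

Both `from_pickle` and `from_npz` end up as dicts keyed by "pdf" and "cdf". np.savez takes the same arguments as np.savez_compressed if compression is not wanted.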

Here is a notebook with a quick benchmark but it's not terribly relevant:

  1. results are pretty much in line with each other, except for np.savez_compressed
  2. once the interpreter imports either of the data files, they remain cached, so the deserialization is actually performed only once per interpreter session (imports don't truly go out of scope). If we switched away from loading via import, we would keep the deserialized data either in the global namespace or memoized in the function(s).
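
The memoized variant mentioned in the second point could look roughly like the sketch below. It is only an illustration under stated assumptions: `load_tables`, `DATA_PATH` and the file name are hypothetical, and the archive is written to a temporary directory with dummy contents so the example is self-contained.

```python
import functools
import os
import tempfile

import numpy as np

# Hypothetical data file, written here with dummy contents for the example.
DATA_PATH = os.path.join(tempfile.mkdtemp(), "levy_tables.npz")
np.savez(DATA_PATH, pdf=np.ones((2, 3)), cdf=np.zeros((2, 3)))


@functools.lru_cache(maxsize=None)
def load_tables(path=DATA_PATH):
    """Read the archive from disk; repeated calls reuse the cached dict."""
    with np.load(path) as archive:
        return {name: archive[name] for name in archive.files}


first = load_tables()    # reads and deserializes the file
second = load_tables()   # served from the cache, no file access
```

Because the result is cached, `first` and `second` are the same dict object, which mirrors the once-per-session behaviour that loading via import gives today.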

josemiotto commented 7 years ago

Thanks aavanian, that's great! I'm switching to the numpy.savez method. The new version, which I hope to push today, is also py3 compatible.