Pylons / pyramid

Pyramid - A Python web framework
https://trypyramid.com/

wiki2 tutorial - Failed building wheel for bcrypt #2590

Closed viniciusban closed 8 years ago

viniciusban commented 8 years ago

Unable to install bcrypt in Ubuntu 12.04 with Python 3.5.1.

Maybe it would be useful to add a note about this possible issue on Ubuntu 12.04.

Steps to reproduce the problem

Set dependency to py_bcrypt instead of bcrypt in setup.py.

mmerickel commented 8 years ago

I hesitated to add bcrypt for reasons like this. I would sooner change the library than recommend a different library per platform.

How are you installing Python 3.5.1 on Ubuntu 12.04?

viniciusban commented 8 years ago

From the deadsnakes ppa:

$ sudo add-apt-repository ppa:fkrull/deadsnakes
$ sudo apt-get update
$ sudo apt-get install python3.5

Then, to create my virtualenv:

$ python3.5 -m venv $VENV --without-pip
$ source $VENV/bin/activate
(venv) $ curl https://bootstrap.pypa.io/get-pip.py | python

BTW, I don't know whether this affects many people.

mmerickel commented 8 years ago

You should ask @dstufft to release a manylinux1 wheel on PyPI for bcrypt. However it may still not work without having libffi installed on the system. I was able to get this to install via the following:

$ docker run --rm -it ubuntu:12.04 /bin/bash
$ apt-get update
$ apt-get install python-software-properties
$ add-apt-repository ppa:fkrull/deadsnakes
$ apt-get update
$ apt-get install libffi-dev python3.5-dev build-essential
$ python3.5 -m venv --without-pip env
$ curl https://bootstrap.pypa.io/get-pip.py | env/bin/python
$ env/bin/pip install bcrypt

digitalresistor commented 8 years ago

16.04 was released last month...

mmerickel commented 8 years ago

I tried on 16.04 btw and it's a pretty similar story. You still need to install libffi-dev and python3.5-dev in order to build bcrypt.

$ docker run --rm -it ubuntu:16.04 /bin/bash
$ apt-get update
$ apt-get install libffi-dev python3.5-dev build-essential
$ python3.5 -m venv env
$ env/bin/pip install bcrypt

Also you'll still get warnings about building wheels unless you env/bin/pip install wheel. :-(

dstufft commented 8 years ago

IIRC py-bcrypt has some security bug now that renders it a bad idea, I could be misremembering though.

dstufft commented 8 years ago

Also: I believe that py-bcrypt requires python3-dev too, so it's really just the libffi-dev dependency via CFFI.

mmerickel commented 8 years ago

I could switch to another library like passlib, except it wasn't clear from its docs whether bcrypt was always available... it's generally only available on *nix systems and not on Windows, afaik. If someone wanted to play with that we could just switch.

dstufft commented 8 years ago

Another option is to use hashlib.pbkdf2_hmac from the standard library, but it's 3.4+ and 2.7.8+ only.

mmerickel commented 8 years ago

I considered this but I'm not willing to do that given the issues with < 2.7.8. If someone else did the work and @bertjwregeer didn't mind then we could merge it. bcrypt is in there because I'm the guy that did the work and thus made the decision and I prefer bcrypt.

stevepiercy commented 8 years ago

I don't mind doing the work to replace bcrypt in the tutorial. I would need to know which one to use instead. Which ones are on the table? I collected these from the discussion:

If people build apps with security in mind, would they use <3.4 or <2.7.8?

mmerickel commented 8 years ago

If we are going to replace it I would recommend simply using hashlib.pbkdf2_hmac. It is just a tutorial after all, and it supports the versions of Python that we really care about. I do not like the passlib support for bcrypt as it's just wrapping other libraries and thus has the same problems (no support on Windows without installing one of the extra dependencies, and those extra deps won't install easily on Ubuntu without installing system libraries).

viniciusban commented 8 years ago

If you'll allow me to chime in, I prefer the stdlib approach because it's already there.

But only if it would be a reasonable alternative for a real application, after all.

@mmerickel I agree it is just a tutorial, but people tend to stick with the first approach they learn, especially newcomers.

BTW, you all exceeded my expectations with the new alchemy scaffold and tutorials. I'm very pleased and learning a lot from them. Keep up the great work!

dstufft commented 8 years ago

PBKDF2 is a perfectly safe hash to use as long as you use enough rounds. It's roughly equivalent to bcrypt in terms of safety, though bcrypt's work factor is logarithmic (each increment doubles the work) while PBKDF2's round count scales linearly.

digitalresistor commented 8 years ago

PBKDF2 is a perfectly acceptable algorithm to use. It just requires a little tuning to match the CPUs the code is running on. On an iPhone 5S 30,000 rounds may be acceptable (so long as the key that is generated is never ever exfiltrated from the device), but on a Core i7 running at 2.5 GHz, you'll want 80,000 or more.

There are tradeoffs with every single choice for a password hashing scheme. I don't claim to know the best choice out there. Current recommendations from people I trust are in this order: scrypt, bcrypt, pbkdf2.

Argon2 is new and is even before scrypt in that list if you are willing to go with bleeding edge.

The generally accepted consensus is that you want the algorithm to run for about 1 second. Yes, this means that the rounds should go up as CPUs continue to become faster and more efficient. Back in 2000 the recommended number of rounds for PBKDF2 was 1000... that is not nearly enough anymore in this day and age.
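As a rough illustration of the tuning described above, the sketch below times a short PBKDF2 probe and scales the round count linearly toward a target duration. The function name `calibrate_pbkdf2_rounds`, the probe size, and the 16-byte salt are assumptions made for this example, not values recommended anywhere in this thread.

```python
import hashlib
import os
import time


def calibrate_pbkdf2_rounds(target_seconds=1.0, probe_rounds=20000):
    """Estimate a round count that takes about target_seconds on this machine.

    Hypothetical helper: it runs one short probe and extrapolates.
    """
    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha512", b"benchmark password", salt, probe_rounds)
    # Guard against a zero reading on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    # PBKDF2 cost grows linearly with the round count, so scale the probe.
    estimate = int(probe_rounds * target_seconds / elapsed)
    # Never return fewer rounds than the probe itself used.
    return max(probe_rounds, estimate)


rounds = calibrate_pbkdf2_rounds()
print(rounds)
```

The result varies by machine and by load, so it is only a starting point; a real deployment would presumably measure on the production hardware and revisit the number over time.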

stevepiercy commented 8 years ago

Starting from the usage example, but using SHA512: https://docs.python.org/3/library/hashlib.html#hashlib.pbkdf2_hmac

How does this look, keeping in mind this will be stored in an SQLite database?

import hashlib, binascii
dk = hashlib.pbkdf2_hmac('sha512', b'password', b'salt', 100000)
binascii.hexlify(dk)

mmerickel commented 8 years ago

I'm -1 on anything that attempts to invent their own hashing scheme. I know I said hashlib.pbkdf2_hmac was acceptable but I'd much prefer avoiding inventing our own format to store the salt, etc as mentioned in https://github.com/Pylons/pyramid/pull/2699#issuecomment-233140474. I used bcrypt in the wiki2 tutorial but if we want to switch the dependency to passlib that's fine with me. With passlib you can use sha512_crypt and trust their judgment on the default number of rounds for the purposes of this tutorial. This will work on all the OSs without requiring any external dependencies.
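To make the concern about "inventing our own format" concrete, here is a minimal sketch of the bookkeeping a hand-rolled PBKDF2 scheme would need: the salt, the round count, and the digest all have to be stored together and parsed back for verification. The `pbkdf2_sha512$rounds$salt$digest` layout and the function names are made up for this illustration; they are not passlib's format or anything the tutorial uses.

```python
import binascii
import hashlib
import hmac
import os

ROUNDS = 100000  # illustrative value, taken from the snippet earlier in the thread


def hash_password(password, rounds=ROUNDS):
    """Return an ad hoc 'scheme$rounds$salt$digest' string (hypothetical format)."""
    salt = os.urandom(16)
    dk = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, rounds)
    return "pbkdf2_sha512${}${}${}".format(
        rounds,
        binascii.hexlify(salt).decode("ascii"),
        binascii.hexlify(dk).decode("ascii"),
    )


def verify_password(password, stored):
    """Recompute the digest from the stored salt and rounds, then compare."""
    scheme, rounds, salt_hex, dk_hex = stored.split("$")
    if scheme != "pbkdf2_sha512":
        return False
    dk = hashlib.pbkdf2_hmac(
        "sha512",
        password.encode("utf-8"),
        binascii.unhexlify(salt_hex),
        int(rounds),
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(dk, binascii.unhexlify(dk_hex))
```

Every choice in there (field order, separator, encoding, how to upgrade the round count later) is one more thing to get wrong, which is the argument for delegating the format to a library such as passlib.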

ztane commented 8 years ago

Also, bcrypt had the problem that it returns bytes, which broke when the value was stored in a string field on Python 3 without decoding it first.
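One common workaround, sketched here with only the standard library, is to decode the hash to str before storing it and encode it again before passing it back to a bytes-based API. The table and column names are hypothetical, and the hash literal is the sample bcrypt output from the interpreter session later in this thread rather than a fresh call to bcrypt.

```python
import sqlite3

# Sample bcrypt output copied from the interpreter session in this thread.
hashed = b"$2b$12$49wvxABeVD6FyIsDuZGCK.h.axhgxTdJMqLZaW/ZJGJFzFe.1L9gy"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password_hash TEXT)")

# Decode before storing so the column holds str, not bytes.
conn.execute(
    "INSERT INTO users VALUES (?, ?)", ("editor", hashed.decode("ascii"))
)

(stored,) = conn.execute(
    "SELECT password_hash FROM users WHERE name = ?", ("editor",)
).fetchone()

# Encode again before handing the value to an API that expects bytes.
roundtripped = stored.encode("ascii")
```

Since the hash consists only of ASCII characters, the round trip is lossless.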

stevepiercy commented 8 years ago

Would the change from bcrypt to passlib affect the plan in issue #2623? I don't think it will, but I want to make sure before I start on it.

mmerickel commented 8 years ago

If you're working on it I'm sure it'll be obvious once you try it out.

ztane commented 8 years ago

Yes, it would. bcrypt returns the base64 version of the hash as a binary string. It is supposed to be text, but that lib screwed up the API. In Python 2 it just didn't matter.

dstufft commented 8 years ago

base64 is not textual, it is bytes.

ztane commented 8 years ago

Python 3.5.1+ (default, Mar 30 2016, 22:46:26)
[GCC 5.3.1 20160330] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import bcrypt
>>> bcrypt.hashpw(b'123', bcrypt.gensalt())
b'$2b$12$49wvxABeVD6FyIsDuZGCK.h.axhgxTdJMqLZaW/ZJGJFzFe.1L9gy'

dstufft commented 8 years ago

Yes, that is the correct return value, cryptography always operates on bytes and returns bytes.

ztane commented 8 years ago

dstufft: hemm... "Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation. The term Base64 originates from a specific MIME content transfer encoding"

ztane commented 8 years ago

The return value is text and should be considered as such. That is also what passlib does.

ztane commented 8 years ago

Also, the Python standard library functions spwd.getsp* return the password hashes as unicode in Python 3.

dstufft commented 8 years ago

Python 3.5.2 (default, Jul  5 2016, 15:30:07)
[GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import base64
>>> base64.b64encode(b"some binary data")
b'c29tZSBiaW5hcnkgZGF0YQ=='

dstufft commented 8 years ago

base64 is "text" in the same way that a string that is encoded as US-ASCII is "text". Just because b"this contains some us-ascii text" has some letters in it that you can read, does not in fact make it textual data that should be a unicode. The RFC explicitly states it is binary data encoded using US-ASCII, and US-ASCII are binary bytes.

dstufft commented 8 years ago

Further, from the RFC:

The encoding process represents 24-bit groups of input bits as output strings of 4 encoded characters

"Encoded characters": encoded characters are binary bytes (you encode a str to get bytes, and then you decode bytes to get a str).

IOW, the use of "Text" here is largely a byproduct of the fact that it comes from a time when people only operated on byte strings and the only difference between textual data and binary data was that you happened to be able to read the textual data when someone spewed it onto your screen (hopefully, or maybe not! Mojibake is a thing of course).

Of course once you've gotten some base64 encoded bytes you can decide you want to decode those characters into textual data for display and that's a perfectly reasonable thing to do.
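A small sketch of that last step, using the same input as the interpreter session above: `b64encode` returns bytes, and an explicit ASCII decode turns them into a str for display or storage. The variable names are only for illustration.

```python
import base64

encoded = base64.b64encode(b"some binary data")  # bytes, as in the session above
as_text = encoded.decode("ascii")                # explicit step from bytes to str

# b64decode accepts either bytes or an ASCII-only str.
original = base64.b64decode(as_text)

print(type(encoded).__name__, type(as_text).__name__)
```

Either representation round-trips back to the original binary data; the disagreement in the thread is only about which type the API should hand back by default.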

dstufft commented 8 years ago

Oh, FWIW the bcrypt library has manylinux1 wheels now as does CFFI. So pip install bcrypt on the most popular Linux distros with a modern pip should not require a compiler.

ztane commented 8 years ago

8-bit binary is not ASCII. ASCII is a 7-bit character encoding where each character maps to a number. Zero-extending ASCII to 8-bit binary is a transformation format akin to UTF-16 or UTF-32. ;)

mmerickel commented 8 years ago

I agree, the manylinux1 wheels solve this issue.

~❯ docker run --rm -it python:3.5-slim bash
root@5d4369f5b678:/# python3 -m venv env
root@5d4369f5b678:/# env/bin/pip install -U setuptools pip
Collecting setuptools
  Downloading setuptools-24.0.3-py2.py3-none-any.whl (441kB)
    100% |████████████████████████████████| 450kB 2.8MB/s
Collecting pip
  Downloading pip-8.1.2-py2.py3-none-any.whl (1.2MB)
    100% |████████████████████████████████| 1.2MB 1.5MB/s
Installing collected packages: setuptools, pip
  Found existing installation: setuptools 20.10.1
    Uninstalling setuptools-20.10.1:
      Successfully uninstalled setuptools-20.10.1
  Found existing installation: pip 8.1.1
    Uninstalling pip-8.1.1:
      Successfully uninstalled pip-8.1.1
Successfully installed pip-8.1.2 setuptools-24.0.3
root@5d4369f5b678:/# env/bin/pip install bcrypt
Collecting bcrypt
  Downloading bcrypt-3.1.0-cp35-cp35m-manylinux1_x86_64.whl (57kB)
    100% |████████████████████████████████| 61kB 681kB/s
Collecting cffi>=1.1 (from bcrypt)
  Downloading cffi-1.7.0-cp35-cp35m-manylinux1_x86_64.whl (396kB)
    100% |████████████████████████████████| 399kB 2.5MB/s
Collecting six>=1.4.1 (from bcrypt)
  Downloading six-1.10.0-py2.py3-none-any.whl
Collecting pycparser (from cffi>=1.1->bcrypt)
  Downloading pycparser-2.14.tar.gz (223kB)
    100% |████████████████████████████████| 225kB 4.2MB/s
Installing collected packages: pycparser, cffi, six, bcrypt
  Running setup.py install for pycparser ... done
Successfully installed bcrypt-3.1.0 cffi-1.7.0 pycparser-2.14 six-1.10.0
root@5d4369f5b678:/#