dcwatson / django-pgcrypto

Python and Django utilities for encrypted fields using pgcrypto.
BSD 2-Clause "Simplified" License
69 stars 23 forks source link

Empty string in DB crashes the lib #30

Closed tomaszsmolarek closed 6 months ago

tomaszsmolarek commented 3 years ago

Hi,

We're using an existing DB and we're migrating to an encrypted solution, so some of our DB data has empty strings - code crashes when calling Django bulk update method.

If you want to use SQL dearmor with an empty string it'll throw an error "Corrupt ascii-armor" - this means that there should be some handling in the library to prevent that.

A test which shows the problem:

    def test_empty_db_string_compare_with_non_blank(self):
        obj = Employee.objects.create(name="Test User 2", date_hired=datetime.date.today(), email="test2@example.com")
        with transaction.atomic():
            with connection.cursor() as cursor:
                # Manually overwrite DB value to empty string
                cursor.execute("UPDATE %s SET ssn = '' WHERE id = %%s;" % (obj._meta.db_table,), [obj.pk])
        obj.refresh_from_db()
        self.assertEqual(obj.ssn, "")
        self.assertNotEqual(obj.ssn, "NON_EMPTY_STRING")
        # Try performing a bulk update in Django - in SQL dearmor("") throws an error "Corrupt ascii-armor"
        Employee.objects.filter(**{"pk": obj.pk}).exclude(**{"ssn": "XYZ"}).update(**{"ssn": "XYZ"})
        obj.delete()

A possible fix:

        if isinstance(self.lhs.field, EncryptedCharField) or isinstance(self.lhs.field, EncryptedTextField):
            # Special case when value in DB is an empty string - use NULL - otherwise dearmor explodes
            return (
                "COALESCE(convert_from(decrypt(dearmor(NULLIF(%s, '')), %%s, '%s'), 'utf-8'),%s)%s %s"
                % (lhs, self.lhs.output_field.cipher_name, lhs, self.lhs.output_field.field_cast, rhs),
                params,
            )
        else:
            return (
                "convert_from(decrypt(dearmor(%s), %%s, '%s'), 'utf-8')%s %s"
                % (lhs, self.lhs.output_field.cipher_name, self.lhs.output_field.field_cast, rhs),
                params,
            )

i.e. a check for column type and then two SQL functions called COALESCE and NULLIF.

I have a fork, but before I post anything I'd like to get some feedback. :)

Thanks!