Add a 'prefix' feature along with documentation and tests for this fe…

jacklinke commented 8 months ago

django-sqids builds off of django-hashids, but one of the nice features of its sister package, django-hashid-field was the ability to provide a per-field prefix that was prepended to the resulting Hashids for that model field.

This pull request adds such a feature to django-sqids - an optional prefix that can be provided as an argument in the model field's implementation. For instance:

class TestModel(Model):
    sqid = SqidsField(real_field_name="id", min_length=7, prefix="P-")
    other_sqid = SqidsField(real_field_name="id", min_length=9, prefix="user_")

will result in model instances that have sqid values like "P-J3TAC2V" or "P-M2A87BX", and other_sqid values like "user_3N9CAW12V" and "user_J90MGN4RD".

Included in the pull request are:

Changes to SqidsField to implement prefixes
Tests (and test_app models) for the prefix feature (and for DRF serialization with or without a prefix)
Updates to Readme to document the new feature
A new Exception IncorrectPrefixError to help identify when the prefix is used incorrectly

Please let me know if you find this useful. I hope you will consider merging it.

julianwachholz commented 8 months ago

Thank you for this PR and contribution, Jack!

I had a look at your proposed changes and would like to merge this after some points for discussion:

I see a conflict in the way invalid prefixes are handled. Let's take this example:
```
# urls.py
urlpatterns = [path("<str:sqid>/", views.detail)]
# views.py
def detail(req, sqid):
 obj = get_object_or_404(queryset, sqid=sqid)
```
If the given model does not use a prefix, trying to access this view with an invalid sqid results in a 404. This is the same behavior you get when using e.g. <int:pk> because of no match in the patterns. This is the result of the upstream Sqids library returning an empty list when given an invalid string as input.

If the given model however does use a prefix, trying to access the same view with an invalid one will result in a server error. I do not think this is desired behavior.

The url pattern needs to use str, because we don't know which patterns make a valid Sqid (or maybe slug or path).

An alternative approach would be to add the prefix to your patterns themselves:
```
urlpatterns = [
 path("api/detail/H-<sqid>/", views.house_detail),  # "H" as a sort of prefix illusion
 path(("api/detail/G-<sqid>/", views.garden_detail),  # Looks like the same route from outside
]
```
I previously worked on a large project before that used django-hashid-field with prefixes and I personally saw no benefit of having them, other than how they looked in URLs. Do you have a more specific use case in mind that would require more than just the prefixes in URL patterns?
This is a rather big change and the DRF tests + dependency update would be better suited in a separate commit. Would you be willing to split your commit and submit a separate PR? I also recommend you update Poetry (The newly committed lockfile was generated with 1.6.1)

Side note: this PR sparked an interesting thought in my head: It would be possible to use regular Sqids for all possible object instances while maintaining full separation: by also encoding the content type ID as the first number in the sqid! I can see this as a possible addition to this project.

def every_detail(r, sqid):
  content_type, object_id = sqids_instance.decode(sqid)
  ...

urlpatterns = path("detail/<sqid>/", every_detail)

jacklinke commented 8 months ago

@julianwachholz thanks for your input. I just submitted a separate PR for the DRF portion.

I will rework this PR as requested, and implement some tests for urls once I resolve that issue.

As for use-case, I really like the prefixes as a simple identifier. The main application I work on provides a suite of tools for utilities districts. Each service request starts with "SR-" Each Invoice starts with "I-", Each Serial Number starts with "S-".

The prefix just provides some quick context about what the identifier we're looking at belongs to.

I'm not sure if anyone else will find it useful, but I really like being able to provide a short pseudo-random value with a bit of added context for each record my customers see.

That content-type idea is pretty cool!

jacklinke commented 8 months ago

@julianwachholz

Please let me know if this meets the intent.

I added a set of tests for urls functionality with and without prefix. Here's the current behavior - let me know if it still needs rework:

-	Model without prefix	Model with prefix
prefix not provided	✅	IncorrectPrefixError is raised
wrong prefix provided	N/A (404)	IncorrectPrefixError is raised
prefix provided	404	✅

Note: I did leave the DRF dev requirement in place, since the tests in this PR check that prefixes serialize correctly, but I moved all DRF-only tests to #2.

Edit:

If the user is using sqids without a prefix, everything works 100% like normal. A correct sqid returns the view, an incorrect sqid returns 404.

In the case of using sqids with prefix, any incorrect sqid currently returns IncorrectPrefixError.

There are 3 ways I can think of the deal with this:

1: Modify `get_prep_value`

Modify get_prep_value to return None if the prefix is required but not present. Then a url expecting prefix, but not getting one, would return a 404 (but it potentially leaves the reason ambiguous).

2: Add a setting

Add a settings option to allow the user to determine which behavior they want from get_prep_value: either None or raise IncorrectPrefixError.

3: Leave it how it is now

Leave it how it is now, but if they want to catch IncorrectPrefixError and return 404, there are a couple approaches.

Approach 1: Modifying Views

For CBV

from django.http import Http404
from django.views.generic import DetailView
from .models import TestModelWithPrefix
from .exceptions import IncorrectPrefixError

class SomeDetailView(DetailView):
    model = TestModelWithPrefix
    slug_field = "sqid"

    def get_object(self, queryset=None):
        queryset = queryset or self.get_queryset()
        sqid = self.kwargs.get(self.slug_url_kwarg)
        if sqid is not None:
            sqid_field = self.get_slug_field()
            try:
                obj = queryset.get(**{sqid_field: sqid})
            except IncorrectPrefixError:
                raise Http404(
                    "No instance matches the given Sqid with the correct prefix."
                )
            except queryset.model.DoesNotExist:
                raise Http404(
                    "No instance matches the given Sqid with the correct prefix."
                )
            return obj
        else:
            raise AttributeError(
                "Generic detail view %s must be called with either an object pk or a slug (sqid)."
                % self.__class__.__name__
            )

Or for FBV

from django.http import Http404
from django.shortcuts import get_object_or_404, render
from .models import TestModelWithPrefix
from .exceptions import IncorrectPrefixError

def some_detail_view(request, sqid):
    try:
        instance = get_object_or_404(TestModelWithPrefix, sqid=sqid)
    except IncorrectPrefixError:
        raise Http404("No instance matches the given Sqid with the correct prefix.")

    return render(request, 'instance_detail.html', {'instance': instance})

Approach 2: Middleware

Alternatively, a user could implement a middleware that catches IncorrectPrefixError so they don't need to modify every view. Something like this:

from django.http import Http404
from .exceptions import IncorrectPrefixError

class SqidsIncorrectPrefixMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        return response

    def process_exception(self, request, exception):
        if isinstance(exception, IncorrectPrefixError):
            raise Http404("Incorrect sqid prefix in request.")
        return None

Do any of these seem reasonable?

I welcome your thoughts on this.

julianwachholz commented 8 months ago

I believe option 1 to be the least surprising and most in line with how Django's path converters work. Imagine registering a custom converter for every SqidsField, the expected behavior as documented here would be to raise a ValueError, which will in turn be treated as a 404.

jacklinke commented 8 months ago

That makes sense. Updated the PR to behave more in-line with the expected results, as requested.

-	Model without prefix	Model with prefix
prefix not provided	✅	404
wrong prefix provided	404	404
prefix provided	404	✅

julianwachholz commented 8 months ago

Thank you for your contribution, @jacklinke!

julianwachholz / django-sqids