jwjacobson / jazztunes

a jazz repertoire management app
https://jazztunes.org
GNU General Public License v3.0

figure out why ORM shuffling does not work #116

Closed bbelderbos closed 8 months ago

bbelderbos commented 8 months ago

views.py

    # TODO: figure out why shuffling does not work
    if len(tunes) < 3:
        suggested_tune = tunes.first()
    else:
        suggested_tune = tunes.order_by("?").first()

jwjacobson commented 8 months ago

As I'm looking into this, I'm finding a lot of sources saying order_by("?") is rather inefficient. Since random selection is an important part of the app's functionality, might it be worth using Python's random module instead? On the other hand, what gets randomly ordered is only the results of a particular search, and even the final database won't be very large, so it might not make a huge difference...
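
For illustration, the random-module alternative could look roughly like the sketch below. The helper name `suggest_tune` and the `rng` parameter are hypothetical and not taken from the jazztunes code; the sketch assumes the search results are small enough to load into memory as a list.

```python
import random


def suggest_tune(tunes, rng=random):
    """Return one randomly chosen item from tunes, or None if there are none.

    `tunes` can be any iterable, for example a Django QuerySet of search
    results (an assumption here) or a plain list.
    """
    candidates = list(tunes)  # iterating a QuerySet fetches all rows once
    if not candidates:
        return None
    # The random choice happens in Python, not in the database.
    return rng.choice(candidates)
```

The trade-off is that every matching row is fetched before one is picked, which is only reasonable for small result sets like the ones described here.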

bbelderbos commented 8 months ago

Yeah, it probably won't be noticeable at all.

Indeed:

https://docs.djangoproject.com/en/dev/ref/models/querysets/#django.db.models.query.QuerySet.order_by

Note: order_by('?') queries may be expensive and slow, depending on the database backend you’re using.

bbelderbos commented 8 months ago

This works though:

    In [1]: from tune.models import Tune

    In [2]: Tune.objects.count()
    Out[2]: 21

    In [3]: Tune.objects.order_by("?").first()
    Out[3]: <Tune: Tune 13 | Bye-Ya>

    In [4]: Tune.objects.order_by("?").first()
    Out[4]: <Tune: Tune 13 | Bye-Ya>

    In [5]: Tune.objects.order_by("?").first()
    Out[5]: <Tune: Tune 20 | Countdown>

    In [6]: Tune.objects.order_by("?").first()
    Out[6]: <Tune: Tune 19 | Giant Steps>

bbelderbos commented 8 months ago

Interesting:

https://stackoverflow.com/questions/962619/how-to-pull-a-random-record-using-djangos-orm

Top answer: MyModel.objects.order_by('?').first()

But there is another interesting take in another answer:

        count = self.aggregate(count=Count('id'))['count']
        random_index = randint(0, count - 1)
        return self.all()[random_index]

However, as the comments there point out, that approach runs two queries.

So yeah, it's not a problem to stick with random ordering until this becomes an issue ...
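
For reference, the count-plus-index approach from the quoted answer can be sketched as a standalone helper. The name `random_row` and the `rng` parameter are hypothetical; the sketch assumes the object passed in behaves like a Django QuerySet, meaning it has a `count()` method and supports integer indexing.

```python
import random


def random_row(queryset, rng=random):
    """Pick one row using a count plus a random offset (two queries)."""
    count = queryset.count()      # query 1: count the matching rows
    if count == 0:
        return None               # nothing to choose from
    index = rng.randrange(count)  # 0 <= index < count
    # Query 2: on a QuerySet, integer indexing fetches only that one row.
    return queryset[index]
```

Besides the extra query, a row deleted between the two queries could make the index fall out of range, so a caller might want to catch `IndexError` and retry.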