Closed PJCampi closed 11 months ago
Thanks for your feedback, these are good comments!
#1 Aggregating annotated field and order_by
If it works to remove the ordering altogether, I could implement that, .e.g., if
qs = Blog.objects.annotate(period=Trunc('date', 'year')).order_by()
pivot(qs, 'period', 'category', 'content', Count)
fixes the issue, I could make the pivot function remove the ordering. I think this should work as the result of pivoting should be independent of order_by.
If a specific ordering is required in this case it's a bit trickier, because adding an ordering that isn't necessary in other cases would degrade performance in those cases.
#2 Default value There is a difference between None and 0 that is meaningful. If I find a bunch of records to sum up and the sum is zero, I should return the value 0. If I look for records to sum up and find no records at all, this is different. If I report 0 in the second case I'm indicating I found records that sum to zero, while if I return None I'm indicating no records were found.
Granted, in some cases people prefer to have the number 0 even in the second case I described above. There are two available ways to achieve this
pivot
from django.db.models import Count
class CountDefaultZero(Count):
template = 'COALESCE({}, 0)'.format(Count.template)
pivot(qs, 'period', 'category', 'content', CountDefaultZero)
Possibly allowing a default passed into the pivot function would be nicer:
```python
pivot(qs, 'period', 'category', 'content', Count, default=0)
or maybe it unnecessarily clutters the api. I'll give it some thought.
#3 Cumulative aggregation Does MySQL support the window function or just postgres? I'm okay with having db specific extensions, just need to think about the right way to do it. Is it possible to pass in an aggregation that does this?
If you have an implementation in mind, I'd be interested in seeing it.
It's a valid argument for Sum, less so for e.g. Count. In practice, like in your shirt sales example, if a shirt of a particular style has not been sold, it will be represented with no row present in an inventory table. The total income for that style should ideally return 0eur.
3# I think that most SQL languages support the OVER (PARTITION BY x ORDER BY y)
clause which is used by the Window function (see here for MySql). I have not tested it though, I use Postgres.
You cannot pass it as an aggregation because you need to call distinct()
afterwards.
Basically instead of:
qs.annotate(**{column: aggregator(data)})
you need to call:
qs.annotate(**{column: Window(aggregator(data), order_by: F(row).asc())}).distinct()
It also requires some kind of support for passing ordering on the row field. Let me know if you want something more concrete.
Opened https://github.com/martsberger/django-pivot/issues/9 to address point 2 above.
I opened #31 for cumulative aggregation. With that, I'm going to close this issue as all the other parts have been resolved.
Hey Marts,
First of all thanks for the package, it simple but very useful :)
I have a couple of comments/questions:
1. Aggregation on annotated field I have noticed that you do not reorder the QuerySet before aggregation, which causes problems when aggregating on an annotated field, if it is already ordered by another field.
For example, consider the model:
If I pivot on an annotated field like:
Aggregation does not work because the 'date' field must be part of the query's group by for the ordering to work.
Reordering fixes the problem, so this will work:
I think it would make sense to implement this in the package.
2. Default value The default column value is
None
atm. Why not using 0? Alternatively could one set the default value at module level? Something like:3. Cumulative aggregation It would be fairly easy to allow for cumulative aggregation as well, which can be useful. It can be done using the window function (+distinct):
It does not work on sqlite though... I'd be happy to have a look at it if you think it could be useful.