Currently, the Dataset_File model uses a Django CharField to represent the size of the file it describes. This causes problems with operations such as aggregation across sets of datafiles.
A CharField was originally used because in (pre?) Django 1.0 there were problems with the size of numbers that could be stored in an IntegerField. Given the large size of some datafiles that would be stored, an IntegerField was not practical, and so it was decided the size would be represented as a CharField instead. This, however, creates problems with aggregation functions across large numbers of datafiles, because text values have to be converted to numbers before they can be summed or compared. Most deployments will be dealing with a large number of datafiles, so it's important that datafile metadata fields can be efficiently accessed and aggregated. It's probably a good idea to see whether there have been any changes in Django 1.3 that will allow us to use an IntegerField here.
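To illustrate the problem, here is a minimal plain-Python sketch (no Django required). The sample sizes and the helper names (`total_size_from_text`, `fits_in_int32`, `fits_in_int64`) are made up for illustration and are not part of the codebase. It assumes the original limit was that of a signed 32-bit integer column, which the ticket does not confirm.

```python
# Hypothetical sample data: file sizes in bytes, stored as text the way
# a CharField would hold them. The second entry is 5 GiB.
sizes_as_text = ["1024", "5368709120", "734003200"]

INT32_MAX = 2**31 - 1  # upper bound of a signed 32-bit integer column
INT64_MAX = 2**63 - 1  # upper bound of a signed 64-bit integer column


def total_size_from_text(sizes):
    """Sum sizes stored as text; every value must be cast to int first."""
    return sum(int(s) for s in sizes)


def fits_in_int32(n):
    """True if n can be stored in a signed 32-bit integer."""
    return -2**31 <= n <= INT32_MAX


def fits_in_int64(n):
    """True if n can be stored in a signed 64-bit integer."""
    return -2**63 <= n <= INT64_MAX


# Comparing the raw strings is lexicographic, so the "largest" value
# found this way is not the numerically largest file.
largest_as_text = max(sizes_as_text)
largest_as_number = max(int(s) for s in sizes_as_text)

total = total_size_from_text(sizes_as_text)
```

With this data the text comparison picks "734003200" over the 5 GiB file, and the 5 GiB size does not fit in 32 bits but does fit in 64. If I recall correctly, Django has shipped a 64-bit BigIntegerField since 1.2, which may be worth checking as the replacement field type.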
original LH ticket