tulsawebdevs / django-multi-gtfs

Django app to import and export General Transit Feed Specification (GTFS)
http://tulsawebdevs.org/
Apache License 2.0

Use .iterator() to save RAM during export #80

Closed: misli closed this 6 years ago

misli commented 6 years ago

Method Base.export_txt doesn't work for large querysets on models where _sort_order[0] doesn't contain '__' (Route, Service, ServiceDate). It raises an AssertionError (see the traceback below) when splitting a large queryset. However, the problem with large querysets can be solved more simply, and with much less RAM, by using .iterator() (https://docs.djangoproject.com/en/2.0/ref/models/querysets/#iterator).

Traceback:
File "./multigtfs/models/feed.py" in export_gtfs
  169.             content = klass.export_txt(self)
File "./multigtfs/models/service.py" in export_txt
  94.         return super(Service, cls).export_txt(feed)
File "./multigtfs/models/base.py" in export_txt
  372.             assert '__' in field1_raw
Exception Type: AssertionError
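
A minimal sketch of the idea proposed above, not the actual multigtfs code: rows are written while looping over `queryset.iterator()` instead of splitting the queryset into ranges. The names `export_rows`, `FakeQuerySet`, and the simplified `Route` class are hypothetical and exist only so the example runs without a database; the only real Django API assumed here is `QuerySet.iterator()`.

```python
import csv
import io


def export_rows(queryset, columns):
    """Write a header plus one CSV line per object, reading via .iterator()."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    # On a real Django QuerySet, .iterator() skips the result cache, so the
    # model instances are not all held in memory at the same time.
    for obj in queryset.iterator():
        writer.writerow([getattr(obj, name) for name in columns])
    # Note: the exported text itself still accumulates in `out`.
    return out.getvalue()


class FakeQuerySet:
    """Hypothetical stand-in for a Django QuerySet (assumption for the demo)."""

    def __init__(self, objects):
        self._objects = objects

    def iterator(self):
        return iter(self._objects)


class Route:
    """Simplified, hypothetical model with two columns."""

    def __init__(self, route_id, short_name):
        self.route_id = route_id
        self.short_name = short_name


text = export_rows(
    FakeQuerySet([Route("R1", "10"), Route("R2", "20")]),
    ["route_id", "short_name"],
)
print(text)
```

Because no splitting by `_sort_order[0]` is involved, this approach would not depend on whether that field contains `'__'`.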

jwhitlock commented 6 years ago

The test failures are due to an issue with PostGIS and TravisCI. I've merged the Django 2.0 branch, so if you rebase they should pass.

I think the current TriMet feed would be a good test of the speed impact of this change. I'm not aware of a good way to measure RAM impact in Python / Django.
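
One possible way to get a rough number for the RAM question is the standard library's `tracemalloc` module. The sketch below is an illustration under assumptions, not a measurement of multigtfs: `peak_bytes`, `rows`, `export_cached`, and `export_streamed` are hypothetical helpers that mimic a cached queryset (everything materialized in a list) versus a streamed one. `tracemalloc` only tracks memory allocated through Python's allocator, so buffers held inside a database driver may not show up.

```python
import tracemalloc


def peak_bytes(func):
    """Return the peak traced Python memory (in bytes) while func runs."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def rows(n):
    """Hypothetical row source standing in for database results."""
    for i in range(n):
        yield "stop-%d" % i


def export_cached(n=50000):
    # Materialize every row first, similar to a fully cached queryset.
    all_rows = list(rows(n))
    return sum(len(r) for r in all_rows)


def export_streamed(n=50000):
    # Consume rows one at a time, similar to looping over .iterator().
    return sum(len(r) for r in rows(n))


cached_peak = peak_bytes(export_cached)
streamed_peak = peak_bytes(export_streamed)
print("cached peak:  ", cached_peak, "bytes")
print("streamed peak:", streamed_peak, "bytes")
```

The same `peak_bytes` wrapper could be put around a real export call before and after the change to compare peaks on a large feed.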

coveralls commented 6 years ago

Coverage remained the same at 100.0% when pulling b493b1dfadecda1d7ee5d01fe9f84de280610b2d on bileto:iterator into 7d306965d9710bb4124d18538ab503798a41bcd3 on tulsawebdevs:master.