coderholic / django-cities

Countries and cities of the world for Django projects
MIT License
921 stars 375 forks

import reliability issues #109

Open techdragon opened 8 years ago

techdragon commented 8 years ago

The import seems to have some serious reliability issues. While I'm sure some of these aren't due to cities doing the wrong thing, the situation around importing the data is very frustrating.

That's my big three. So this isn't all negative :smile:, I don't mind writing a PR for specifying existing files as the import source, assuming such a feature is a welcome addition, since this has been slowing down some integration testing work.

blag commented 8 years ago

Yeah, I kinda think it would be better to simply use the 'requests' library and keep all of the downloads in memory, but that's just my opinion and I haven't checked how big some of the downloaded files are. ETags and hashes of the file contents can help track different versions of files, so we avoid re-downloading and reimporting data that hasn't changed.
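A rough sketch of the ETag-plus-hash idea, just to make it concrete. The helper names (`conditional_headers`, `needs_reimport`, `sha256_hex`) are hypothetical and not part of django-cities; it also assumes the server actually sends an `ETag` header and honours `If-None-Match`, which I haven't verified for the data source.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the downloaded bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def conditional_headers(cached_etag):
    """Build request headers asking the server to skip unchanged files.

    `cached_etag` is whatever ETag value we stored after the last download
    (hypothetical storage, e.g. a small table or a JSON file).
    """
    return {"If-None-Match": cached_etag} if cached_etag else {}


def needs_reimport(status_code: int, body: bytes, cached_hash) -> bool:
    """Decide whether the downloaded data has to be imported again.

    A 304 response means the server says nothing changed. Otherwise we fall
    back to comparing content hashes, which also covers servers that ignore
    ETags and always answer 200.
    """
    if status_code == 304:
        return False
    return sha256_hex(body) != cached_hash
```

With 'requests' this would be used roughly as `resp = requests.get(url, headers=conditional_headers(etag))` followed by `needs_reimport(resp.status_code, resp.content, stored_hash)`, keeping the body in memory the whole time.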

techdragon commented 8 years ago

@blag some of them are a respectable number of megabytes. I've spent this week with 1Mbps internet and re-discovered a lot about how small is relative.

Also, I discovered this week that the current approach is 'troublesome' on AWS Elastic Beanstalk due to root folder ownership as a result of how libraries are installed into the environment. Having an option to specify "use this file ~/hypothetical/file/path.txt, it's compressed|already uncompressed" is still needed, and could in theory serve as the 'building block' for more automatic commands via use of call_command(), temp files, requests, etc.
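To illustrate the "use this file" building block, here is a minimal sketch of reading lines from a local file that may or may not be zipped. `read_import_lines` and its parameters are hypothetical, not an existing django-cities function or option, and it assumes UTF-8 text and zip as the only compression format.

```python
import io
import zipfile
from pathlib import Path


def read_import_lines(path, compressed=None, member=None):
    """Yield text lines from a local import file.

    path:       location of the file, "~" is expanded.
    compressed: True/False to force the mode, None to auto-detect a zip.
    member:     name of the file inside the archive (defaults to the first).
    """
    path = Path(path).expanduser()
    if compressed is None:
        # Auto-detect: is_zipfile checks the file's magic number.
        compressed = zipfile.is_zipfile(path)

    if compressed:
        with zipfile.ZipFile(path) as archive:
            name = member or archive.namelist()[0]
            with archive.open(name) as raw:
                # Decode the binary stream lazily instead of loading it all.
                for line in io.TextIOWrapper(raw, encoding="utf-8"):
                    yield line.rstrip("\n")
    else:
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                yield line.rstrip("\n")
```

Nothing here writes to disk, so it would not care who owns the package folder; a management command option could simply hand the given path to a helper like this.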

coderholic commented 8 years ago

@techdragon thanks for creating the issue.

Are you unable to track down the problem even with logging enabled, or do you not have logging enabled?

Can you give some more details around what you've experienced with corrupted files, and wanting to reuse existing files, and what's not currently working there for you?

blag commented 8 years ago

Downloading files to disk breaks deployment in simple Docker containers because they only have read-only filesystems.

davidmarquis commented 8 years ago

@blag you can always use Docker volumes, although I agree this complicates the deployment strategy.

hayyyyyyden commented 7 years ago

+1, I have this problem too when importing alt_name data: it just freezes there, without any progress.