Closed jplehmann closed 4 years ago
Well, I've done some investigation and this does indeed seem to be quite a dilemma. The issue is that Google is seemingly incorrectly geolocating this location. If I lookup this location via a webform it returns the full location details, including zip code. If I do it from a string via Python, as you've done here, it does not return the zip code.
I'll need to do some deeper investigation as to why Google doesn't want to return the zip code in some instances. I'll try and get this sorted as soon as possible, sorry for the inconvenience.
Okay, it looks like the issue is a result of two things:
This is a bit disappointing to say the least, I'd hoped that sourcing Google's geocode data would eliminate these kinds of issues, but sadly that doesn't appear to be the case.
So, I can think of a couple of possible solutions.
Okay, I've implemented some changes to try and help correct this problem. I'll explain what I've done.
So, the problem really boils down to the Python version of the geocoder not doing the same thing as the JavaScript one; when we do a lookup via JS it works properly, but not from Python. What I've implemented can be broken down into two parts:
raw
field to the address model. This stores the unmodified value that gets passed in to the lookup
routine in django-address
. This way we have the unmodified value that was requested to be stored as an address, and also the value that Google thinks the address should be.consistent
. This just shows if the raw
column matches the formatted
column. This is included so you can now sort the entries by whether or not they have the value you expected them to be. Then, when you want to fix an inconsistent address, you can use the inline lookup column to quickly find the correct address and save it.Please feel free to update the code, but please bare in mind that the migrations have changed, so you'll need to start from scratch with your original data.
And, as always, please let me know if (when) you encounter more problems. :)
Yep, you'll need to begin from scratch, migrating from your original data. I've made modifications to some of the earlier migrations so trying to rollback migrations won't work.
Cheers, Luke
On Tue, Jan 12, 2016 at 12:01 AM John Lehmann notifications@github.com wrote:
Having some problems here. I faked it back to the migration before my data migration (since it has no undo), then I roll back the initial django-address migration. I think this latter one shouldn't be necessary but even then, when I make migrations it picks up no extra field like I expect (raw).
What's worse, when I try my data migration now, I get:
File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/operations.py", line 13, in geolocate component_model=self.component_model) File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/models.py", line 336, in to_python return to_python(*args, **kwargs) File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/models.py", line 179, in to_python return lookup(value, instance, address_model, component_model, geolocator) File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/models.py", line 215, in lookup addr = to_python(location.raw, instance, address_model, component_model) File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/models.py", line 186, in to_python return _to_python(value, instance, address_model, component_model) File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/models.py", line 117, in _to_python roots[ii], h = _save(roots[ii], 0) File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/models.py", line 109, in _save obj.parent, h = _save(obj.parent, h + 1) File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/models.py", line 109, in _save obj.parent, h = _save(obj.parent, h + 1) File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/models.py", line 109, in _save obj.parent, h = _save(obj.parent, h + 1) File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/models.py", line 109, in _save obj.parent, h = _save(obj.parent, h + 1) File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/models.py", line 113, in _save parent=obj.parent File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/db/models/manager.py", line 127, in manager_method return getattr(self.get_queryset(), name)(*args, **kwargs) File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/db/models/query.py", line 405, in get_or_create return self.get(**lookup), False File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/db/models/query.py", line 328, in get num = len(clone) File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/db/models/query.py", line 144, in __len__ self._fetch_all() File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/db/models/query.py", line 965, in _fetch_all self._result_cache = list(self.iterator()) File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/db/models/query.py", line 238, in iterator results = compiler.execute_sql() File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 840, in execute_sql cursor.execute(sql, params) File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/db/backends/utils.py", line 79, in execute return super(CursorDebugWrapper, self).execute(sql, params) File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/db/backends/utils.py", line 59, in execute self.db.validate_no_broken_transaction() File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/db/backends/base/base.py", line 327, in validate_no_broken_transaction "An error occurred in the current transaction. You can't " TransactionManagementError: An error occurred in the current transaction. You can't execute queries until the end of the 'atomic' block.
On Sun, Jan 10, 2016 at 7:39 PM, Luke Hodkinson notifications@github.com wrote:
Okay, I've implemented some changes to try and help correct this problem. I'll explain what I've done.
So, the problem really boils down to the Python version of the geocoder not doing the same thing as the JavaScript one; when we do a lookup via JS it works properly, but not from Python. What I've implemented can be broken down into two parts:
1.
I've added a raw field to the address model. This stores the unmodified value that gets passed in to the lookup routine in django-address. This way we have the unmodified value that was requested to be stored as an address, and also the value that Google thinks the address should be. 2.
I've augmented the admin view to make it easier to identify these bogus lookups and correct them using the JavaScript geocoder. In the list display in the admin you'll be able to see a new column called consistent. This just shows if the raw column matches the formatted column. This is included so you can now sort the entries by whether or not they have the value you expected them to be. Then, when you want to fix an inconsistent address, you can use the inline lookup column to quickly find the correct address and save it.
Please feel free to update the code, but please bare in mind that the migrations have changed, so you'll need to start from scratch with your original data.
And, as always, please let me know if (when) you encounter more problems. :)
— Reply to this email directly or view it on GitHub < https://github.com/furious-luke/django-address/issues/34#issuecomment-170414225
.
— Reply to this email directly or view it on GitHub https://github.com/furious-luke/django-address/issues/34#issuecomment-170541737 .
Okay, I understand reapplying my data migration, but I need to do much more than that right?
I am trying to roll back my address all the way but it gets stuck at version 3. However, it looks like the raw column gets added in address migration 1, and I cannot move forward without applying it.
Do I need to manual drop the address table?
So I manually dropped all address tables and faked back to 0 to get started afresh. I then used the new code version, and migrated. To test I then tried to go back to zero again. I still got stuck at version 3 which is irreversible, but this time after faking it to 2 I was able to then migrate to zero without having to drop anything. Somehow my schema had gotten out of sync with the code (and I tried different code versions).
Moving forward, do we anticipate many more schema changes that will be backward incompatible -- can we not apply them as additional migrations as opposed to modifying the earlier ones already applied? This was all on my development instance, and I don't want to be doing a lot of this in production.
Thanks
1.
That sort of out-of-sync behavior will only happen if I should need to make a change to a past migration. This is pretty rare, the only reason I did this time was because without it we would have lost all the raw values after the conversion from your data to django-address. Now that we have the raw values included I shouldn’t need to make any past migration alterations. 2.
Ah yes, I know what’s up there. 3.
Yeah, I think you’re right about that. It’s usefulness is pretty much limited to future additions. I have an idea I’d like to try to make it more useful, though. I’ll implement a better matching scheme so that small deviations from Google’s calculated value don’t register as inconsistent. This would hopefully reduce the set of inconsistent addresses to just those that have actually failed to be looked up correctly. 4.
That sounds like a bug to me. I’ll get on to it today.
On Tue, Jan 12, 2016 at 1:56 AM John Lehmann notifications@github.com http://mailto:notifications@github.com wrote:
1.
Question from above, about how future migrations will work. 2.
Issue: This may have already been a problem, but just noticed that in my form, if the user edits an object which already had an address set, so the form is pre-populated with that a good value, when the user clicks Save it acts as though they did not enter a value at all and fails validation. 3.
Feedback: Consistent flag seems to be of low value just because there are so many things that can set it off, such as a missing comma or "United States" versus "USA". However it might be helpful in more quickly identifying the objects that I am converting now versus the ones in the future which will only ever store a "consistent" value. What I do find useful is being able to see where formatted is empty. 4.
Minor: I notice that every time an unresolved address is entered, it creates a new row, even if the text is duplicate. This could result in a lot of garbage over time, but I guess it can be cleaned up by deleting any addresses not referenced by the application.
— Reply to this email directly or view it on GitHub https://github.com/furious-luke/django-address/issues/34#issuecomment-170578695 .
Hi John,
Okay, I've made some changes to the way the consistency value is calculated. Hopefully now it's a bit more robust (although to be honest I'm not sure how effective it's going to be without a real dataset to test on).
In addition, the issue from 2. should be fixed now.
Cheers, Luke
On Tue, Jan 12, 2016 at 9:06 AM Luke Hodkinson furious.luke@gmail.com wrote:
1.
That sort of out-of-sync behavior will only happen if I should need to make a change to a past migration. This is pretty rare, the only reason I did this time was because without it we would have lost all the raw values after the conversion from your data to django-address. Now that we have the raw values included I shouldn’t need to make any past migration alterations. 2.
Ah yes, I know what’s up there. 3.
Yeah, I think you’re right about that. It’s usefulness is pretty much limited to future additions. I have an idea I’d like to try to make it more useful, though. I’ll implement a better matching scheme so that small deviations from Google’s calculated value don’t register as inconsistent. This would hopefully reduce the set of inconsistent addresses to just those that have actually failed to be looked up correctly. 4.
That sounds like a bug to me. I’ll get on to it today.
On Tue, Jan 12, 2016 at 1:56 AM John Lehmann notifications@github.com http://mailto:notifications@github.com wrote:
1.
Question from above, about how future migrations will work. 2.
Issue: This may have already been a problem, but just noticed that in my form, if the user edits an object which already had an address set, so the form is pre-populated with that a good value, when the user clicks Save it acts as though they did not enter a value at all and fails validation. 3.
Feedback: Consistent flag seems to be of low value just because there are so many things that can set it off, such as a missing comma or "United States" versus "USA". However it might be helpful in more quickly identifying the objects that I am converting now versus the ones in the future which will only ever store a "consistent" value. What I do find useful is being able to see where formatted is empty. 4.
Minor: I notice that every time an unresolved address is entered, it creates a new row, even if the text is duplicate. This could result in a lot of garbage over time, but I guess it can be cleaned up by deleting any addresses not referenced by the application.
— Reply to this email directly or view it on GitHub https://github.com/furious-luke/django-address/issues/34#issuecomment-170578695 .
Luke,
Using version 8b1419ea3ed69c5a60cba127cbbbded8914791ec
:
File "./manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/core/management/__init__.py", line 338, in execute_from_command_line
utility.execute()
File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/core/management/__init__.py", line 312, in execute
django.setup()
File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/__init__.py", line 18, in setup
apps.populate(settings.INSTALLED_APPS)
File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/apps/registry.py", line 108, in populate
app_config.import_models(all_models)
File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/apps/config.py", line 198, in import_models
self.models_module = import_module(models_module_name)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/models.py", line 19, in <module>
from .consistency import get_consistency_from_parts
File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/consistency.py", line 4, in <module>
PUNC_TABLE = str.maketrans(dict([(c, None) for c in string.punctuation]))
AttributeError: type object 'str' has no attribute 'maketrans'
thanks, John
I’m guessing you’re using Python 2? I’ve just pushed some changes to make the code more backwards compatible, if you still have trouble migrating I’ll install Python 2 and make sure it’s working locally.
By the way, I forgot to mention that after successfully migrating you’ll need to run ./manage.py checkconsistency to recalculate the consistency values. Then take a look in the admin and see if the number of spurious results has decreased to a reasonable number.
As for the form issue, I’ve just pushed some more fixes that should hopefully correct the duplicates.
Cheers, Luke
On Tue, Jan 12, 2016 at 11:41 PM John Lehmann notifications@github.com http://mailto:notifications@github.com wrote:
Luke,
Using version 8b1419ea3ed69c5a60cba127cbbbded8914791ec:
Make migrations crashes with:
File "./manage.py", line 10, in
execute_from_command_line(sys.argv) File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/core/management/init.py", line 338, in execute_from_command_line utility.execute() File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/core/management/init.py", line 312, in execute django.setup() File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/init.py", line 18, in setup apps.populate(settings.INSTALLED_APPS) File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/apps/registry.py", line 108, in populate app_config.import_models(all_models) File "/Users/john/.venv/mm/lib/python2.7/site-packages/django/apps/config.py", line 198, in import_models self.models_module = import_module(models_module_name) File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/init.py", line 37, in import_module import(name) File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/models.py", line 19, in from .consistency import get_consistency_from_parts File "/Users/john/.venv/mm/lib/python2.7/site-packages/address/consistency.py", line 4, in PUNC_TABLE = str.maketrans(dict([(c, None) for c in string.punctuation])) AttributeError: type object 'str' has no attribute 'maketrans' - Still having the problem with form mentioned above. If I never change the value from the loaded value, it acts as though that field were blank. I also noticed these attempts are adding duplicate entries into the address table (raw only).
thanks, John
— Reply to this email directly or view it on GitHub https://github.com/furious-luke/django-address/issues/34#issuecomment-170899813 .
Hmm, that's interesting about the consistency check. I may have accidentally broken something.
Do you think it would be okay for you to send me a few addresses that are representative of the consistency problem? I.e. have the same value for raw and formatted, or are almost the same. It would be very helpful in diagnosing what's going wrong.
Cheers, Luke
On Wed, 13 Jan 2016 12:38 am John Lehmann notifications@github.com wrote:
- Make migrations is now fixed. Yes using Python 2.7.
- Form is now fixed -- yae! This was my blocker.
- Although on the form when I enter an address without autocompleting it (like "usa"), it still adds duplicate rows in the address table. This is not a blocker.
- I cannot tell that consistency has gotten any better, maybe even worse. Of about 30 addresses only 1 of them is marked consistent, and some are even verbatim the same text in the two columns in admin. And I did run the command.
— Reply to this email directly or view it on GitHub https://github.com/furious-luke/django-address/issues/34#issuecomment-170913830 .
So I found a pretty big problem with the consistency calculation and have since fixed it and pushed the changes. If you get an opportunity to test the consistency stuff again then please let me know how it goes.
I'm closing this issue as the main concerns originally raised seem to have been addressed .
When I have users enter an address, I'm asking for city/state/zip/country. Currently I receive these in separate fields but my plan was to only have 1 field moving forward for the
address
. But sometimes, the lookup appears to discard the zipcode:This works:
But this doesn't:
I've tried to speculate on why this could be, such as maybe this zipcode is not actually in Wausau (but it appears to be). However in any case, my concern is that I'd be throwing away data to only use the
address
field it appears. How do you recommend this be handled?It appears that this is only a problem if the user doesn't enter a street address. I am thinking I may tell users that if the zipcode doesn't "stick" they need to provide a street address in their zipcode, and I will just not display the street part.