Closed NickCrews closed 5 months ago
i like the simplicity of this approach, but it will be a breaking change for data models that include datetime. in this code this is only working because of the surviving name spacing bits we are trying to get away from.
it will also be a breaking change for data models that include the name, address, and fuzzy category types that are not bundled with dedupe but are recommended in the docs
while this pattern is nicer for devs, i don’t like that if a library’s file structure changes, that can break a downstream user’s code. the file structure should not have to be part of an external API (though i grant that it often is)
i like the simplicity of this approach, but it will be a breaking change for data models that include datetime. in this code this is only working because of the surviving name spacing bits we are trying to get away from.
Hmmm, you're right. Two possible solutions I see:
In variables/__init__.py, we can say from dedupe_variable_datetime import DateTimeType as DateTime
it will also be a breaking change for data models that include the name, address, and fuzzy category types that are not bundled with dedupe but are recommended in the docs
Hmm, that's a good point. I was just thinking of breaking changes to people using their own custom plugins. I think the best option here is temporarily have a hardcoded mapping for these edge cases (eg "FuzzyCategorical"->"dedupe_variable_fuzzycategory:FuzzyCategorical"), and if you use them then you get a deprecation warning. We can remove them later, after users get a chance to switch. I think eventually they should follow the same rules as other 3rd party plugin variables, since they are external and need to be pip installed.
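A minimal sketch of that temporary mapping, to make the idea concrete. The names LEGACY_VARIABLE_SPECS and normalize_variable_type are hypothetical and not part of dedupe; only the FuzzyCategorical entry comes from this thread, and other legacy names would need to be added by hand.

```python
import warnings

# Hypothetical table of legacy short names mapped to "module:Class" specs.
# Only this one entry is taken from the discussion; the rest is an assumption.
LEGACY_VARIABLE_SPECS = {
    "FuzzyCategorical": "dedupe_variable_fuzzycategory:FuzzyCategorical",
}


def normalize_variable_type(name: str) -> str:
    """Translate a legacy short name to its "module:Class" spec.

    Names that are not in the legacy table are returned unchanged.
    """
    spec = LEGACY_VARIABLE_SPECS.get(name)
    if spec is None:
        return name
    warnings.warn(
        f"Variable type {name!r} is deprecated; use {spec!r} instead.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller's data model
    )
    return spec
```

Once users have had a release or two to switch, the table can be emptied and the function deleted without touching the rest of the lookup path.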
while this pattern is nicer for devs, i don’t like that if a library’s file structure changes, that can break a downstream user’s code. the file structure should not have to be part of an external API (though i grant that it often is)
That's a good point. I think that is the responsibility of the plugin authors and is easily avoided. E.g. in a zipcodelib package, the root __init__.py should contain from zipcodelib.foo.bar import ZipCodeVariable, so that users can just use "zipcodelib:ZipCodeVariable". Then the plugin author is free to move ZipCodeVariable from zipcodelib/foo/bar.py to zipcodelib/baz.py, adjust the import in the root __init__.py, and everything still works.
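To illustrate why the re-export keeps user code stable, here is a rough sketch of how a "package:Name" string could be resolved. load_variable_class is a hypothetical helper, not dedupe's actual implementation. Since zipcodelib is an imaginary package, the usage lines lean on the standard library's json package, which re-exports JSONDecoder from json/decoder.py in its root __init__.py in the same way.

```python
import importlib


def load_variable_class(spec: str):
    """Resolve a "package:Name" string to the object it names.

    Only the package's public attribute is looked up, so the file that
    actually defines the class can move without breaking the spec.
    """
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:Name', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# JSONDecoder lives in json/decoder.py but is re-exported by json/__init__.py,
# so the short spec works and would keep working if the file were renamed.
decoder_cls = load_variable_class("json:JSONDecoder")
print(decoder_cls.__module__)
```

The user-facing string only names the package root, so the internal layout stays a private detail of the plugin.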
closed by #1193
Will close https://github.com/dedupeio/dedupe/issues/1085
This probably could do with some more polishing, eg removing type from every Variable class, and swapping if variable_type == "Interaction" to parsing the variable type, then checking if isinstance(variable_type, dedupe.variables.InteractionType)
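A small sketch of what that swap might look like. The classes below are stand-ins defined only for this example (they are not dedupe's real Variable hierarchy), and is_interaction is a hypothetical helper.

```python
class Variable:
    """Stand-in base class; no string `type` attribute is needed."""


class StringType(Variable):
    pass


class InteractionType(Variable):
    pass


class CustomInteraction(InteractionType):
    """A plugin subclass that a string comparison would have missed."""


def is_interaction(variable: Variable) -> bool:
    # Class-based check instead of comparing a type string to "Interaction".
    return isinstance(variable, InteractionType)


print(is_interaction(InteractionType()))    # True
print(is_interaction(CustomInteraction()))  # True
print(is_interaction(StringType()))         # False
```

One side effect of the isinstance check is that subclasses of the interaction type are recognized automatically, which a string comparison on a type name would not do.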