Closed jsheunis closed 1 year ago
Thanks, having an example translator that goes along the catalog PR is very helpful!
It looks to me that, compared to the "old" generation of translators, the "new" class needs to:
get_supported_()*
methods (these would be used by match()
inherited from TranslatorBase
)translate()
method which takes metadata as an inputI looked at the proposed Translate
implementation in the catalog, and wonder about two things. I'm putting them here just to share my thoughts:
Would it be better to expect the translator class to implement the match()
method directly, instead of having the methods that report supported name, version etc? (BTW, we should probably also consider passing extractor ID, translators may use this info or not). This way, the TranslatorBase
class wouldn't need impose any particular logic. So if the extractor uses semver, PEP440, or any other versioning scheme, the translator could rely on available comparison methods from e.g. packaging or looseversion and implement any matching logic that's needed. Of course, already there is nothing that prevents the translator class from overriding match()
.
Would it be better to have the match()
method be a @classmethod
, so the translator wouldn't need to be instantiated until we know that it matches? This way metadata could still go to __init__
, although maybe it isn't a pattern worth preserving.
Expressed in code rather than prose, TranslatorBase
would have something like:
@classmethod
@abstractmethod
def match(self, source_name: str, source_version: str, source_uuid: str|None = None) -> bool:
Thanks for this feedback.
Would it be better to expect the translator class to implement the match() method directly, instead of having the methods that report supported name, version etc? (BTW, we should probably also consider passing extractor ID, translators may use this info or not). This way, the TranslatorBase class wouldn't need impose any particular logic. So if the extractor uses semver, PEP440, or any other versioning scheme, the translator could rely on available comparison methods from e.g. packaging or looseversion and implement any matching logic that's needed. Of course, already there is nothing that prevents the translator class from overriding match().
Perhaps it's useful to instruct developers that they can (or are encouraged to?) provide a custom match()
implementation that overrides the method from the base class, but we still keep the current implementation on the base class so that there's something to fall back on?
Would it be better to have the match() method be a @classmethod, so the translator wouldn't need to be instantiated until we know that it matches? This way metadata could still go to init, although maybe it isn't a pattern worth preserving.
Good point! I agree.
I updated the PR with changes to the catalog's TranslatorBase
that were made since. I will either merge it now, or branch off of that, and work on updating other extractors. Thanks again for providing the reference implementation!
To be used for testing of https://github.com/datalad/datalad-catalog/pull/246