open-contracting / extension_registry.py

Eases access to information from the extension registry of the Open Contracting Data Standard
https://ocdsextensionregistry.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Add method to get the latest version of an extension #41

Open yolile opened 1 month ago

yolile commented 1 month ago

Similar to the ExtensionRegistry.get() method, but returning the latest version rather than the first one. Alternatively, add a parameter to get().

jpmckinney commented 1 month ago

What's the context for this need?

yolile commented 1 month ago

I'm using the library to get the list of all the latest versions of core extensions to include them in the mapping template. But I'm using __iter__() for now, so this issue is not a priority.

jpmckinney commented 1 month ago

Do you want the latest tagged version (if the extension has tags) or just the latest version? Because the latest version is always the one without a date (until we start having both 1.1 and 1.2 extensions in the registry).

yolile commented 1 month ago

Well, I want the ones that should be in the field-level mapping template, so I guess the latest tagged version (e.g. not the one in the master branch)

jpmckinney commented 1 month ago

For now, you can just hardcode these to use v1.1.5:

bids enquiries location lots milestone_documents participation_fee process_title

The rest can use the version with no date.

yolile commented 1 month ago

I'm doing

    from ocdsextensionregistry import ExtensionRegistry

    base_url = 'https://raw.githubusercontent.com/open-contracting/extension_registry/main'
    registry = ExtensionRegistry(f'{base_url}/extension_versions.csv',
                                 f'{base_url}/extensions.csv')

    core_extensions = set([registry.get(id=extension.id,
                                        version=max([version.version for version in registry.__iter__()])).base_url
                           for extension in registry.filter(core=True)])

With this as the result:

{'https://raw.githubusercontent.com/open-contracting-extensions/ocds_location_extension/v1.1.5/', 
'https://raw.githubusercontent.com/open-contracting-extensions/ocds_process_title_extension/v1.1.5/', 
'https://raw.githubusercontent.com/open-contracting-extensions/ocds_milestone_documents_extension/v1.1.5/', 
'https://raw.githubusercontent.com/open-contracting-extensions/ocds_lots_extension/v1.1.5/', 
'https://raw.githubusercontent.com/open-contracting-extensions/ocds_enquiry_extension/v1.1.5/', 
'https://raw.githubusercontent.com/open-contracting-extensions/ocds_bid_extension/v1.1.5/', 
'https://raw.githubusercontent.com/open-contracting-extensions/ocds_participationFee_extension/v1.1.5/'}

jpmckinney commented 1 month ago

That works, though max(['1.1.5', '1.1.11']) is '1.1.5' instead of '1.1.11', because strings are compared character by character. Probably not a problem for us, but to be correct it would need to be something like max(tuple(map(int, version.split('.'))) for version in ['1.1.5', '1.1.11'])
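To illustrate the difference, here is a minimal sketch comparing the two approaches. The helper name version_key is hypothetical, and it assumes every input is a dotted numeric version with an optional leading 'v'; a branch name such as 'master' would raise ValueError and would need to be filtered out first.

```python
def version_key(version):
    """Turn 'v1.1.11' or '1.1.11' into (1, 1, 11) so parts compare as integers."""
    return tuple(int(part) for part in version.lstrip('v').split('.'))


versions = ['1.1.5', '1.1.11']

# String comparison: '5' sorts after '1', so '1.1.5' wins.
naive = max(versions)

# Numeric comparison: (1, 1, 11) is greater than (1, 1, 5).
numeric = max(versions, key=version_key)

print(naive, numeric)
```

Passing the helper as key= keeps the original string as the result, whereas the generator expression above returns the tuple (1, 1, 11).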

Also, calling __iter__() multiple times is slow. You can cache the result with extension_versions = list(registry), then iterate over that.

jpmckinney commented 1 month ago

Oh, it's probably more correct to change this to for version in registry.filter(id=extension.id), so that you're not picking the max among all extensions (which would only need to be calculated once, if that were the intention).
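Putting the suggestions in this thread together, one possible sketch picks the latest tagged version per extension in a single pass over a cached list. This is not the library's API: latest_tagged_base_urls, version_key, TAG and Record are hypothetical names, and Record is only a stand-in for the objects the registry yields, assumed here to expose id, version, core and base_url attributes. The sample URLs and the v1.1.11 entry are made up for illustration.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for the objects yielded by the registry.
Record = namedtuple('Record', 'id version core base_url')

# Assumption: tagged versions look like 'v1.1.5'; anything else (e.g. 'master') is skipped.
TAG = re.compile(r'^v?\d+(\.\d+)*$')


def version_key(version):
    """Turn 'v1.1.11' into (1, 1, 11) so parts compare as integers."""
    return tuple(int(part) for part in version.lstrip('v').split('.'))


def latest_tagged_base_urls(extension_versions):
    """Return the base URL of the highest tagged version of each core extension."""
    latest = {}
    for ev in extension_versions:
        if not ev.core or not TAG.match(ev.version):
            continue
        current = latest.get(ev.id)
        if current is None or version_key(ev.version) > version_key(current.version):
            latest[ev.id] = ev
    return {ev.base_url for ev in latest.values()}


records = [
    Record('lots', 'master', True, 'https://example.org/lots/master/'),
    Record('lots', 'v1.1.4', True, 'https://example.org/lots/v1.1.4/'),
    Record('lots', 'v1.1.5', True, 'https://example.org/lots/v1.1.5/'),
    Record('bids', 'v1.1.5', True, 'https://example.org/bids/v1.1.5/'),
    Record('bids', 'v1.1.11', True, 'https://example.org/bids/v1.1.11/'),
    Record('budget', 'v1.1.5', False, 'https://example.org/budget/v1.1.5/'),
]

core_extensions = latest_tagged_base_urls(records)
print(sorted(core_extensions))
```

With a real registry, the same function could presumably be called as latest_tagged_base_urls(list(registry)), provided the yielded objects carry those attributes. The maximum is computed per extension id, so one extension's newer tag never affects another's.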