OML-Team / open-metric-learning

Metric learning and retrieval pipelines, models and zoo.
https://open-metric-learning.readthedocs.io/en/latest/index.html
Apache License 2.0
879 stars 61 forks source link

Make shortcuts for imports #535

Closed AlekseySh closed 4 months ago

AlekseySh commented 6 months ago

from oml.datasets.base import X -> from oml.datasets import X So, make all the functions and classes accessible with no more than two-level imports

mimi030 commented 6 months ago

Hello! I'm interested in contributing to this issue. Is it still open?

Could you confirm if the goal here is to simplify imports and enhance readability by reducing import depth across all affected files?

AlekseySh commented 6 months ago

Hey, @mimi030 Thank you for the interest!

That's correct. Ideally, I want to have no more than 2 levels in our imports. For example, from oml.losses.triplet import TripletLossWithMiner -> from oml.losses import TripletLossWithMiner. Also, don't forget to update examples (check the contribution guide before, please).

I think, you can start with losses, models and miners. I would'n touch validation part much at the moment because we are expecting big updates there.

mimi030 commented 6 months ago

Thanks for clarifying the scope, @AlekseySh !

I'll start by simplifying import paths within the 'losses,' 'models,' and 'miners' modules to ensure they don't exceed two levels. For example, I'll refactor import statements like this:

# Example from oml/models/vit_unicom/extractor.py
# Before refactoring
from oml.interfaces.models import IExtractor
from oml.models.utils import (
    remove_criterion_in_state_dict,
    remove_prefix_from_state_dict,
)
from oml.models.vit_unicom.external import vision_transformer
from oml.models.vit_unicom.external.model import load  # type: ignore
from oml.utils.misc_torch import normalise

# After refactoring
from oml.interfaces import IExtractor
from oml.models import (
    remove_criterion_in_state_dict,
    remove_prefix_from_state_dict,
    vision_transformer,
    load,  # type: ignore
)
from oml.utils import normalise

Could you also clarify which components are included in the 'validation part' you mentioned? Is it related to the extractor_training_pipeline and extractor_validation_pipeline, or the validation steps outlined in the documentation? Thanks for your help!

I'll also update relevant examples in the documentation to reflect these changes.

AlekseySh commented 6 months ago

Your example looks great!

By validation part I mean postprocessors, metrics and so on. So, that is why I suggest to start with losses, miners, samplers, and models. You can simplify their imports everywhere.

mimi030 commented 6 months ago

Thanks for your guidance! I've started simplifying imports across all files in losses, models, miners, and samplers modules. However, a couple of questions popped up:

  1. I noticed some optional dependencies like from pytorch_grad_cam.utils.image import show_cam_on_image within functions in certain files. Should we simplify these as well?
  2. When you mentioned "You can simplify their imports everywhere," does this include simplifying imports from the losses, models, miners, and samplers modules across the entire project, including registry and test files?
  3. Regarding the documentation updates, are we focusing on simplifying imports in the "feature extraction" examples? Can you confirm if the example below aligns with the changes?
# Example from 'Training' from Examples in 'Feature Extraction'

# Original import statements:
import torch
from tqdm import tqdm
from oml.datasets.base import DatasetWithLabels
from oml.losses.triplet import TripletLossWithMiner
from oml.miners.inbatch_all_tri import AllTripletsMiner
from oml.models import ViTExtractor
from oml.samplers.balance import BalanceSampler
from oml.utils.download_mock_dataset import download_mock_dataset

# Revised import statements:
import torch
from tqdm import tqdm
from oml.datasets.base import DatasetWithLabels
from oml.losses import TripletLossWithMiner
from oml.miners import AllTripletsMiner
from oml.models import ViTExtractor
from oml.samplers import BalanceSampler
from oml.utils.download_mock_dataset import download_mock_dataset
AlekseySh commented 6 months ago

Hi!

  1. If something is inconvenient so simplify, especially helper function with optional requirements like show_cam_on_image, keep in untouched. In the end, the main goal is to have shorter access to the main classes and functions.
  2. I think simplifying imports in tests is just wasting your time, so, don't do it. But in the /oml module we can simplify it, yes.
  3. Almost, i would say:
import torch
from tqdm import tqdm
from oml.datasetsimport DatasetWithLabels
from oml.losses import TripletLossWithMiner
from oml.miners import AllTripletsMiner
from oml.models import ViTExtractor
from oml.samplers import BalanceSampler
from oml.utils import download_mock_dataset
AlekseySh commented 6 months ago

Anyway, it's hard to to predict all the possible combinations in advance. So, you can create a relatively small PR with only criterions optimization (for example), where we can discuss corner cases which can arise.

mimi030 commented 6 months ago

Thanks for the suggestion! I've submitted a pull request with initial changes focusing on import simplification. Feel free to review it when you have time.

One note: the import simplification led to some test errors. Also, I needed to install python-dotenv, sphinx, sphinx-mdinclude, and sphinx-rtd-theme to run tests and build documentation. Should we include these in the installation package or document them?

If you have other adjustments or optimizations to discuss, I'm ready to work on those. Thanks!

AlekseySh commented 6 months ago

@mimi030 we should not include this to the package to make the original package lighter. But I think it would be good if you add this information to the contribution guide: https://github.com/OML-Team/open-metric-learning/blob/main/CONTRIBUTING.md . You can mention that contributor may need to install one of:

mimi030 commented 6 months ago

Hey @AlekseySh,

I made another pull request, could you review this one? Thank you!

AlekseySh commented 4 months ago

Thanks for your work. To finish everything faster I took some of this code and merged it along with OML 3.0 release, because I needed short imports so new examples look nice.

What would be really cool now -- is further imports shortification to allow 1-level import like from oml import TripletLoss. It would be super great but I'm afraid also super hard because of circular imports. I'm still thinking what can we do here. Probably, we can moved only part of the functions and classes to the first level.

mimi030 commented 4 months ago

@AlekseySh Thanks for your feedback and integrating the code into the OML 3.0 release.

Regarding shortening imports to enable 1-level import, could you specify which files you're considering for modification? I'm aware of circular imports and testing time. Last time, I shifted some functions to "init.py" to enable 2-level imports. Do you think it's feasible again?

If this requires extensive changes across many files, I might need to pass the torch as my present laptop can't handle extensive testing.

Again, appreciate the chance to be part of this important work—thank you!

AlekseySh commented 4 months ago

@mimi030 I think you are right and it may require a lot of functions or classes to move, so, probably this changes should be done by me, after I have better understanding of how exactly it should be done.