Description

Currently, if a pipeline needs a different extraction method than the one defined in base/, we have to rewrite the _extract method from scratch. Many pipelines cannot tell whether images come from the same study. In such cases, we use the source image ID as the study ID and 0.png as the img_id. Several pipelines redefine the _extract method to do this. We would like a set of private methods in the base extractors so that pipelines can reuse existing methods rather than duplicating the same logic in several places.

What to do

Look at what extracting methods are defined across pipelines/.
Create new private methods in BaseStudyIdExtractor that cover the logic required by the pipelines, e.g.
```
import os

def _extract_from_filename(self, path: str) -> str:
    return os.path.basename(path)
```
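To make the idea more concrete, here is a minimal sketch of how such helpers could sit on the base class. The `BaseStudyIdExtractor` below is only a stand-in written for this example: its `__call__`, its placeholder default `_extract`, and the helper names `_extract_from_filename_stem` and `_extract_from_parent_dir` are assumptions, not the existing code in base/.

```python
import os


class BaseStudyIdExtractor:
    """Stand-in for the base class in base/ (hypothetical; the real class may differ)."""

    def __call__(self, img_path: str) -> str:
        return self._extract(img_path)

    def _extract(self, img_path: str) -> str:
        # Placeholder default; the actual default in base/ is not shown in this issue.
        return self._extract_from_parent_dir(img_path)

    def _extract_from_filename(self, img_path: str) -> str:
        # File name with extension, e.g. "scan_001.png".
        return os.path.basename(img_path)

    def _extract_from_filename_stem(self, img_path: str) -> str:
        # File name without extension, e.g. "scan_001" (assumed helper name).
        return os.path.splitext(os.path.basename(img_path))[0]

    def _extract_from_parent_dir(self, img_path: str) -> str:
        # Name of the directory that directly contains the file (assumed helper name).
        return os.path.basename(os.path.dirname(img_path))


class StudyIdExtractor(BaseStudyIdExtractor):
    """Pipeline-specific extractor that only picks one of the premade helpers."""

    def _extract(self, img_path: str) -> str:
        return self._extract_from_filename_stem(img_path)
```

With this layout a pipeline override shrinks to a single line that selects a helper, and any fix to a helper applies to every pipeline that uses it.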

Redefine the extractor subclasses across pipelines to use the new methods, e.g. in pipelines/alzheimers.py:

```
class StudyIdExtractor(BaseStudyIdExtractor):
    """Extractor for study IDs specific to the Alzheimer's dataset."""

    def _extract(self, img_path: str) -> str:
        return self._extract_from_filename(img_path)
```

Expected behaviour

BaseStudyIdExtractor has private methods for the common extracting operations, e.g. extracting the filename or returning "0.png".
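One possible way to share these helpers between the study ID and image ID extractors is a small mixin. This is a sketch under stated assumptions: `ExtractorHelpers`, `_return_zero_png`, and `ids_for` are hypothetical names introduced only for illustration, and the two extractor classes are simplified stand-ins that do not inherit from the real base classes.

```python
import os
from typing import Tuple


class ExtractorHelpers:
    """Hypothetical mixin holding premade private helpers."""

    def _extract_from_filename(self, img_path: str) -> str:
        # Use the source file name as the ID.
        return os.path.basename(img_path)

    def _return_zero_png(self, img_path: str) -> str:
        # img_path is unused; it is accepted so all helpers share one signature.
        return "0.png"


class StudyIdExtractor(ExtractorHelpers):
    """Study ID taken from the source image file name."""

    def _extract(self, img_path: str) -> str:
        return self._extract_from_filename(img_path)


class ImgIdExtractor(ExtractorHelpers):
    """Constant image ID for datasets without study information."""

    def _extract(self, img_path: str) -> str:
        return self._return_zero_png(img_path)


def ids_for(img_path: str) -> Tuple[str, str]:
    # Returns (study_id, img_id) for a dataset that cannot group images by study.
    return StudyIdExtractor()._extract(img_path), ImgIdExtractor()._extract(img_path)
```

For a path such as `raw/alzheimers/img_0042.png`, `ids_for` yields the file name as the study ID and `"0.png"` as the image ID, which matches the convention described in the issue for pipelines that cannot tell which images belong to the same study.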

Current behaviour

Each pipeline defines its own extracting method if it differs from the base. Many pipelines require the same method for extracting but each of them defines it separately.

TheLion-ai / UMIE_datasets

Add premade extracting methods for StudyIdExtractor #73
