dmarx / Multi-Modal-Comparators

Unified API to facilitate usage of pre-trained "perceptor" models, a la CLIP

Wrapper class for mocking the openai CLIP.clip module #31

Closed: dmarx closed this 2 years ago

dmarx commented 2 years ago

Facilitates importing preprocessing utilities separately from the model, following the OpenAI CLIP idiom.
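
For reference, this mirrors the openai CLIP idiom where preprocessing is usable without ever loading a model:

from CLIP import clip

tokens = clip.tokenize(['a prompt'])  # tokenizer works with no model loaded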

dmarx commented 2 years ago

I had a similar thought but was planning to take it in a slightly different direction: modify the mocked clip.available_models() to query the registry (and support openai's aliases), and modify the mocked clip.load() to accept mmc model names. So then all a user would have to do is replace

from CLIP import clip

with

from mmc.mock.openai.CLIP import clip

And from there, they could do stuff like

model = clip.load(id='RN50')

or

model = clip.load(id='[clip - sajjjadayobi - clipfa]')

Also, to be clear: I think what you described is already how loading from the registry works. REGISTRY.find() returns an unmocked mmc loader, like in your example. The solution I just described would return a mocked model.
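
Something like this minimal sketch (the alias table, module path, and load signatures here are illustrative assumptions, not mmc's actual API):

# Hypothetical registry-backed mock of the openai clip module.
from mmc.registry import REGISTRY   # module path assumed
from mmc.mock.openai import MockOpenaiClip

# Illustrative alias table mapping openai's short names to mmc ids.
OPENAI_ALIASES = {'RN50': '[clip - openai - RN50]'}

def available_models():
    # Every registered mmc id, plus openai's familiar aliases.
    return list(OPENAI_ALIASES) + [str(ldr) for ldr in REGISTRY]  # assumes REGISTRY is iterable

def load(id, device='cpu'):
    # Accept an openai alias or a full mmc id; return a mocked model.
    mmc_id = OPENAI_ALIASES.get(id, id)
    ldr = REGISTRY.find(id=mmc_id)   # unmocked mmc loader, per above
    model = ldr.load(device=device)  # assumed loader signature
    return MockOpenaiClip(model)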

The benefit of wrapping a single specific loader, like in my earlier example, is that it lets us put the appropriate tokenizer in clip.tokenize (instead of the SimpleTokenizer). We lose this ability with the method I just described though, so maybe the mocked clip.available_models could only list those compatible with openai's tokenizer. Or maybe we add something like mmc.get_preprocessors(**kwargs) to make stuff like tokenizers fetchable from queries.

On Wed, May 25, 2022, 19:39, apolinario commented on this pull request:

In tests/test_module_mock.py https://github.com/dmarx/Multi-Modal-Comparators/pull/31#discussion_r882266296 :

clip = MockOpenaiClipModule(ldr)

+def test_available_models():
+    from mmc.mock.openai import MockOpenaiClipModule
+    from mmc.loaders import OpenAiClipLoader
+    ldr = OpenAiClipLoader(id='RN50')
+    clip = MockOpenaiClipModule(ldr)
+    assert clip.available_models() == str(ldr)
+
+def test_private_mmc():
+    from mmc.mock.openai import MockOpenaiClipModule, MockOpenaiClip
+    from mmc.loaders import OpenAiClipLoader
+    from mmc.multimodalcomparator import MultiModalComparator
+    ldr = OpenAiClipLoader(id='RN50')

I feel having to choose which loader to use defeats the purpose of the automation that mocking the CLIP module could allow for. What if, instead of ldr = OpenAiClipLoader(id='RN50'), we had something like ldr = OpenMMCLoader(id='[clip - sajjjadayobi - clipfa]'), which accepts (and is compatible with) all valid MMC perceptors?
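
Something like this rough sketch, say (class name, module path, and signatures all hypothetical):

# Hypothetical OpenMMCLoader: a thin shim over the registry, so any
# registered perceptor is loadable by its '[arch - publisher - model]' id.
from mmc.registry import REGISTRY  # module path assumed

class OpenMMCLoader:
    def __init__(self, id):
        self.id = id

    def load(self, device='cpu'):
        ldr = REGISTRY.find(id=self.id)  # assumed query signature
        return ldr.load(device=device)   # assumed loader signature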

The main idea of mock_api_plus was that it would carry the spirit of mock_api: when we mock OpenAI, we still allow any of the supported perceptors to be loaded into it. I just wanted to extend that so the user can use the tokenizer and image preprocessor of all models without having to write a lot of conditions on their end.


apolinario commented 2 years ago

We lose this ability with the method I just described though

I think this is not worth it. What we would want to optimize for is that people can load MMC and support as many perceptors as possible without writing any extra conditions like "if CLIP Farsi, then load this tokenizer".

Ideally, someone with a list of all available perceptors that support both TEXT and IMAGE modalities could use the same code for CLIP guidance with all of them, without any extra conditions.

dmarx commented 2 years ago

I agree! I'm suggesting that rather than having any conditional stuff, we'd make it so they could fetch whichever tokenizer they'd need via something like tokenizer = mmc.get_preprocessor(model, modality), which would work for all models.
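
Roughly like this (get_preprocessor doesn't exist yet; the registry query and loader attributes below are assumptions):

# Hypothetical mmc.get_preprocessor: dispatch to whatever tokenizer or
# image transform the named model was registered with.
from mmc.registry import REGISTRY  # module path assumed

def get_preprocessor(model, modality):
    # modality: 'text' -> tokenizer, 'image' -> image transform
    ldr = REGISTRY.find(id=str(model))  # assumed query signature
    return ldr.preprocessors[modality]  # assumed loader attribute

That way the same guidance loop could work for every TEXT+IMAGE perceptor, with no per-model conditionals:

for model in clip.available_models():  # the mocked module from above
    tokenize = get_preprocessor(model, 'text')
    tokens = tokenize(['a prompt'])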


apolinario commented 2 years ago

Gotcha. Okay, that's neat.

dmarx commented 2 years ago

Just gonna merge this as-is for now. We've got some cool ideas here that I plan to continue forward with, but there's no reason to keep this useful chunk of code holed up while I figure out how I want the fancier version of mocking to work.