I don't know much about the huggingface side of things, so take what I say with a grain of salt... However, it looks like the config files contain a mix of SAM model config parameters and loads of junk that has nothing to do with the SAM model (but maybe needs to be there for compatibility with the rest of the library?).
For example, there are nonsense entries like an end-of-sentence token id, mixed in with real SAM parameters like the layer indices for global attention. So making sense of the config files probably requires hunting around to figure out which entries are actually relevant.
The config.json file has three important sections: mask_decoder_config, prompt_encoder_config and vision_config. These seem to correspond to the model configs for the classes MaskDecoder, PromptEncoder and ImageEncoderViT, respectively. The values in the config file appear to be for the 'huge' variant of the SAM model, which is set within the build_sam.py script. Similarly, the preprocessor_config.json seems to reference the preprocessing steps found in the SAM model class and parts of the predictor. So those are the places I would look to understand the values/ranges of the different settings.
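If it helps to see that split concretely, here is a minimal sketch (assuming the transformers SamConfig class and the facebook/sam-vit-huge checkpoint) that loads the published config and prints the three sub-configs:

```python
from transformers import SamConfig

# Load the config that ships with the 'huge' checkpoint and inspect the
# three sub-configs mentioned above.
config = SamConfig.from_pretrained("facebook/sam-vit-huge")

print(config.vision_config)          # image encoder (ViT) settings
print(config.prompt_encoder_config)  # prompt encoder settings
print(config.mask_decoder_config)    # mask decoder settings
```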
That being said, the values all seem to be related to the model structure. Changing these values makes sense if the goal is to load a different set of weights (like the 'base' or 'large' weights), but would otherwise break the model, since they aren't the kind of values that can be tuned for better performance or anything (at least, for a given set of weights).
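In other words, rather than editing config values by hand, switching to the 'base' or 'large' weights is just a matter of loading the matching checkpoint. A quick sketch, assuming the standard facebook/sam-vit-* checkpoints:

```python
from transformers import SamModel, SamProcessor

# Pick one of the published checkpoints; the config shipped with each
# checkpoint already matches its weights, so nothing needs to be edited.
checkpoint = "facebook/sam-vit-base"  # or "facebook/sam-vit-large" / "facebook/sam-vit-huge"
model = SamModel.from_pretrained(checkpoint)
processor = SamProcessor.from_pretrained(checkpoint)
```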
Thank you a lot for the detailed information - this is already quite helpful!
Yes, this is also the first time for me using huggingface. As I understand it, it basically abstracts models away so they can be used with a few lines of code, which is actually perfect for my use case (a general web demonstrator for different ML models). So I guess it makes sense that there is stuff in the configs for compatibility reasons.
Hi @TheRealRolandDeschain this is all documented here for instance: https://huggingface.co/docs/transformers/model_doc/sam#transformers.SamMaskDecoderConfig.
@NielsRogge Thank you, I will have a look through the linked documentation!
Hi, I am currently working on a UI wrapper for the SAM models on huggingface. Basically, the wrapper lets the user choose images and parameters and then dynamically generates the preprocessor_config.json and config.json files before running the model inference. This works fine with default parameters, however I am wondering if there is documentation of the parameters in the JSON files (short description, value limits, datatype)? This would be quite helpful!
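For context, a rough sketch of what the generation step looks like if done through the transformers config classes rather than templating the JSON by hand (hypothetical, and the parameter values shown are just the documented defaults, not my actual UI inputs):

```python
from transformers import SamConfig, SamVisionConfig, SamProcessor

# Build a config object from user-chosen parameters and write it out as
# config.json; the __init__ arguments are the same parameters that end up
# in the JSON file.
vision_config = SamVisionConfig(image_size=1024, patch_size=16)
config = SamConfig(vision_config=vision_config.to_dict())
config.save_pretrained("generated_sam_config")  # writes config.json

# The processor can be saved the same way to produce preprocessor_config.json.
processor = SamProcessor.from_pretrained("facebook/sam-vit-huge")
processor.save_pretrained("generated_sam_config")
```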