rafiibnsultan / GeoSAM

Fine-tuning SAM with Multi-Modal Prompts for Mobility Infrastructure Segmentation
https://sites.google.com/view/mlpa/mainpage

Please share mod_cls_txt_encoding.pth #6

Open gaurav14u opened 2 months ago

gaurav14u commented 2 months ago

Hi Rafi, thank you for sharing the code. I was trying to run your notebook, which has a reference to

mod_cls_txt_encoding = torch.load("/home/rafi/GeoSAM/mod_cls_txt_encoding.pth").to(device)

Could you please share it?

rafiibnsultan commented 2 months ago

Hi, it's already shared in the repository: https://github.com/rafiibnsultan/GeoSAM/blob/GeoSAM_with_text/mod_cls_txt_encoding.pth
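For anyone else hitting this, a minimal loading sketch (the local filename and device handling below are assumptions; point torch.load at wherever you downloaded the file):

import torch

device = "cuda:0" if torch.cuda.is_available() else "cpu"
# Load the precomputed CLIP text encodings on CPU, then move them to the target device
mod_cls_txt_encoding = torch.load("mod_cls_txt_encoding.pth", map_location="cpu").to(device)
print(mod_cls_txt_encoding.shape)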

rafiibnsultan commented 2 months ago

You can also create your own:

import clip
import torch

device = "cuda:0" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load('ViT-B/32', device)

classes = ["Sidewalk and crosswalk", "Roads"]
details = ["linear paths, zebra stripes.", "paved surfaces, vehicle lanes."]

# Build one "class: detail" prompt per class
sentences = [f'{item}: {detail}' for item, detail in zip(classes, details)]
for sentence in sentences:
    print(sentence)

txt_encoding = []
with torch.no_grad():
    # Tokenize the prompts and encode them with CLIP's text encoder
    text_inputs = torch.cat([clip.tokenize(f'{item}: {detail}') for item, detail in zip(classes, details)]).to(device)
    text_features = model.encode_text(text_inputs)
    # L2-normalize each text embedding
    text_features /= text_features.norm(dim=-1, keepdim=True)

    print(text_features.shape, text_features.dtype)
    txt_encoding.append(text_features)

mod_cls_txt_encoding = torch.stack(txt_encoding)
print(mod_cls_txt_encoding.shape)
torch.save(mod_cls_txt_encoding, 'mod_cls_txt_encoding.pth')
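As a quick sanity check after saving, you can reload the file and inspect its shape (the expected (1, 2, 512) shape is an assumption based on ViT-B/32's 512-dimensional text embeddings and the two classes above):

import torch

reloaded = torch.load('mod_cls_txt_encoding.pth', map_location='cpu')
# Expect torch.Size([1, 2, 512]): one stacked list entry, two class prompts, 512-dim CLIP text features
print(reloaded.shape, reloaded.dtype)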