Hi @yun189, thank you for your interest in SpatialBot.
In training, when the model is asked about a depth value, we design two QA types: (a) Answer directly, so the model needs to read the depth information from the depth map input itself; (b) Call the Depth API, e.g. "The depth value of point (0.51,0.40) is Depth(0.51,0.40)." In the next turn of the conversation, the API replies with Depth(0.51,0.40)=..., and the model uses that depth value to complete the task.
In evaluation, if you want the model to use the depth map directly, you can prompt it with 'Answer directly with depth map'. Otherwise, the model decides on its own whether to call the API. If it calls the Depth API, you only need to give it the corresponding depth value. Sample code:
import re

import numpy as np
from PIL import Image


def extract_depth_api(response):
    # Find every Depth(x,y) call in the model's reply.
    pattern = r"Depth\(([\d.]+),\s*([\d.]+)\)"
    matches = re.findall(pattern, response)
    if matches:
        results = [(float(match[0]), float(match[1])) for match in matches]
        return results
    else:
        return None


response = call_model_engine(args, sample, model, tokenizer, processor)
response = str(response)
depth_api_values = extract_depth_api(response)
if depth_api_values is not None:
    sample['question_2'] = ''
    sample['response_1'] = response
    for depth_api_value in depth_api_values:
        x, y = depth_api_value[0], depth_api_value[1]
        depth_map = Image.open(...)
        width, height = depth_map.size
        depth_map = np.array(depth_map)
        # Convert normalized coordinates to pixel indices (row = y, column = x).
        pixel_x = max(int(x * width) - 1, 0)
        pixel_y = max(int(y * height) - 1, 0)
        depth_value = depth_map[pixel_y, pixel_x]
        sample['question_2'] = sample['question_2']+'Depth('+str(x)+','+str(y)+')='+str(depth_value)+', '
    # Drop the trailing separator, then query the model again with the API results.
    sample['question_2'] = sample['question_2'].strip(', ')
    response_after_api = call_model_engine(args, sample, model, tokenizer, processor)
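To check the coordinate handling without a model or an image file, the lookup step can be tried in isolation. The following is only a minimal sketch: it assumes a toy depth map stored as a nested list (standing in for the NumPy array in the snippet above), and `lookup_depth` and `format_api_reply` are hypothetical helper names, not functions from the SpatialBot code base.

```python
def lookup_depth(depth_map, x, y):
    """Return the depth at normalized coordinates (x, y), each in [0, 1].

    depth_map is a list of rows (height x width); it stands in for the
    NumPy array used in the snippet above.
    """
    height = len(depth_map)
    width = len(depth_map[0])
    # Same convention as the snippet above: scale, shift by one, clamp at 0.
    pixel_x = max(int(x * width) - 1, 0)
    pixel_y = max(int(y * height) - 1, 0)
    return depth_map[pixel_y][pixel_x]


def format_api_reply(points, depth_map):
    """Build the 'Depth(x,y)=value' text that is sent back to the model."""
    parts = [f"Depth({x},{y})={lookup_depth(depth_map, x, y)}" for x, y in points]
    return ", ".join(parts)


# Toy 4x4 depth map: the value at row r, column c is 10 * r + c.
toy_map = [[10 * r + c for c in range(4)] for r in range(4)]
reply = format_api_reply([(0.5, 0.75), (0.0, 0.0)], toy_map)
print(reply)  # Depth(0.5,0.75)=21, Depth(0.0,0.0)=0
```

Note that the row index comes from y and the column index from x, which matches the `depth_map[pixel_y, pixel_x]` indexing in the sample code.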
So the conversation may look like:
User: What is the depth value of object: elephant?
SpatialBot: The elephant corresponds to a bounding box of [0.15,0.70,0.25,0.80], so it corresponds to Depth(0.2,0.75).
User(API): Depth(0.2,0.75) = 100
SpatialBot: The depth value of elephant is 100.
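The exchange above can be reproduced end to end with a stand-in for the model, which may help when wiring the two-turn flow into your own evaluation script. This is a sketch under stated assumptions: `stub_model` returns canned replies in place of the real model call, `run_with_depth_api` is a hypothetical name, and the depth lookup is a fixed value rather than a real depth map.

```python
import re

DEPTH_CALL = re.compile(r"Depth\(([\d.]+),\s*([\d.]+)\)")


def stub_model(turns):
    """Hypothetical stand-in for the real model call; returns canned replies."""
    if len(turns) == 1:
        return ("The elephant corresponds to a bounding box of "
                "[0.15,0.70,0.25,0.80], so it corresponds to Depth(0.2,0.75).")
    return "The depth value of elephant is 100."


def run_with_depth_api(question, get_depth, model=stub_model):
    """Ask once; if the reply contains Depth(x,y) calls, answer them and ask again."""
    turns = [("User", question)]
    reply = model(turns)
    turns.append(("SpatialBot", reply))
    calls = DEPTH_CALL.findall(reply)
    if calls:
        # Answer every requested point in a single API turn.
        api_text = ", ".join(
            f"Depth({x},{y}) = {get_depth(float(x), float(y))}" for x, y in calls
        )
        turns.append(("User(API)", api_text))
        turns.append(("SpatialBot", model(turns)))
    return turns


conversation = run_with_depth_api(
    "What is the depth value of object: elephant?",
    get_depth=lambda x, y: 100,  # fixed value for illustration only
)
for speaker, text in conversation:
    print(f"{speaker}: {text}")
```

If the first reply contains no `Depth(...)` call, the function returns after a single turn, which corresponds to the model answering directly from the depth map.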
I'll close this issue since no further questions have been raised for a week. Feel free to reopen it if you still have concerns about our work.
I cannot find the code for DepthAPI in the project.