Jumpat / SegAnyGAussians

The official implementation of SAGA (Segment Any 3D GAussians)
Apache License 2.0

extract_features.py is only provided in the first version #77

Open Garfield9986 opened 5 months ago

Garfield9986 commented 5 months ago

extract_features.py is only provided in the first version, yet it still seems to be required in v2, and there is no --downsample argument. Also, where can I download a CLIP model that get_clip_features.py can run with (e.g. ViT-B-16-laion2b_s34b_b88k.bin)?

Jumpat commented 5 months ago

Hi, v2 does not need this script. The CLIP model is downloaded automatically by the open_clip library.
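For reference, a minimal sketch of how open_clip fetches such a checkpoint automatically; the model name and pretrained tag below are assumptions inferred from the filename mentioned above, not read from get_clip_features.py:

```python
import torch
import open_clip

# The first call with a pretrained tag downloads the weights (here the
# LAION-2B ViT-B-16 checkpoint) into open_clip's local cache, so no
# manual download of ViT-B-16-laion2b_s34b_b88k.bin is needed.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-16", pretrained="laion2b_s34b_b88k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-16")

with torch.no_grad():
    text_features = model.encode_text(tokenizer(["a photo of a fern"]))
print(text_features.shape)  # torch.Size([1, 512])
```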

Garfield9986 commented 5 months ago

But extracting features without this file seems to fail. There is no IMG_4026.pt in my sam_masks folder, only image000.pt through image019.pt:

Looking for config file in /home/yang/3DRC/SegAnyGAussians/output/8ae9df13-1/cfg_args
Config file found: /home/yang/3DRC/SegAnyGAussians/output/8ae9df13-1/cfg_args
Loading trained model at iteration 30000, None
Allow Camera Principle Point Shift: False
Reading camera 20/20
Loading Training Cameras
Loading Test Cameras
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "get_scale.py", line 104, in <module>
    masks = torch.load(os.path.join(os.path.join(dataset.source_path, 'sam_masks'), image_path.replace('jpg', 'pt').replace('JPG', 'pt').replace('png', 'pt')))
  File "/home/yang/Anaconda/yas/envs/SAGA/lib/python3.7/site-packages/torch/serialization.py", line 699, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/yang/Anaconda/yas/envs/SAGA/lib/python3.7/site-packages/torch/serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/yang/Anaconda/yas/envs/SAGA/lib/python3.7/site-packages/torch/serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/home/yang/3DRC/SegAnyGAussians/data/nerf_llff_data/fern/sam_masks/IMG_4026.pt'

Jumpat commented 5 months ago

The files under sam_masks are produced by the extract sam masks script. The error here is that the downsampled filenames no longer correspond to the original filenames, so you need to build a mapping from the original names (i.e. IMG_4026) to the downsampled names (imagexxx). Both are normally sorted numerically, so sorting the two lists and pairing them up is enough. Alternatively, you can add --downsample_type mask when extracting the SAM masks; this stops the script from using the pre-downsampled files with misaligned names, and instead extracts masks from the full-resolution images and downsamples the masks afterwards. Note that this may cause GPU out-of-memory problems.
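For the first option, a rough sketch of that rename-by-sorted-order step; the directory paths below are assumptions based on the fern log above, and it is worth backing up sam_masks before running anything like this:

```python
import os

# Hypothetical paths for the fern scene from the traceback above.
images_dir = "data/nerf_llff_data/fern/images"     # original IMG_40xx.JPG files
masks_dir = "data/nerf_llff_data/fern/sam_masks"   # image000.pt ... image019.pt

# Sort both lists so the i-th mask corresponds to the i-th original image.
orig_names = sorted(f for f in os.listdir(images_dir)
                    if f.lower().endswith((".jpg", ".png")))
mask_names = sorted(f for f in os.listdir(masks_dir) if f.endswith(".pt"))
assert len(orig_names) == len(mask_names)

for orig, mask in zip(orig_names, mask_names):
    target = os.path.splitext(orig)[0] + ".pt"      # e.g. IMG_4026.pt
    os.rename(os.path.join(masks_dir, mask), os.path.join(masks_dir, target))
```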

Garfield9986 commented 5 months ago

Thank you very much @Jumpat for the quick reply, but there is one thing I still don't quite understand. Do you mean that, under the same downsample value, I only need to rename image000.pt-image019.pt, in order, to IMG_4026.pt-IMG_4045.pt and can then use them as the features? I did try --downsample_type mask, and GPU memory was indeed insufficient.

Jumpat commented 5 months ago

> Do I only need to rename image000.pt-image019.pt, in order, to IMG_4026.pt-IMG_4045.pt and can then use them as the features?

Yes, just make sure the images actually correspond.

Garfield9986 commented 5 months ago

Really sorry to bother you again. The previous problem is solved, but when I continue running I hit the following error. When I train the original 3DGS with train_scene.py and resolution=8 in cfg_args:

Looking for config file in /home/yang/3DRC/SegAnyGAussians/output/03bc5f2a-f/cfg_args
Config file found: /home/yang/3DRC/SegAnyGAussians/output/03bc5f2a-f/cfg_args
Loading trained model at iteration 30000, None
Allow Camera Principle Point Shift: False
Reading camera 20/20
Loading Training Cameras
Loading Test Cameras
20it [00:01, 17.84it/s]
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "get_scale.py", line 142, in <module>
    points_in_3D[:,:,0] = (grid_index[:,:,0] - cx) * depth / fx
RuntimeError: The size of tensor a (378) must match the size of tensor b (504) at non-singleton dimension 1

When I train the original 3DGS with train_scene.py and resolution=-1 in cfg_args:

Looking for config file in /home/yang/3DRC/SegAnyGAussians/output/8ae9df13-1/cfg_args
Config file found: /home/yang/3DRC/SegAnyGAussians/output/8ae9df13-1/cfg_args
Loading trained model at iteration 30000, None
Allow Camera Principle Point Shift: False
Reading camera 20/20
Loading Training Cameras
Loading Test Cameras
20it [00:01, 17.79it/s]
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "get_scale.py", line 142, in <module>
    points_in_3D[:,:,0] = (grid_index[:,:,0] - cx) * depth / fx
RuntimeError: The size of tensor a (1200) must match the size of tensor b (1600) at non-singleton dimension 1

Have you ever run into this kind of problem? QAQ

Jumpat commented 4 months ago

It looks like H and W are swapped? We have not encountered this problem; please provide more of the program's intermediate outputs (e.g. tensor shapes) to help locate the issue.

Garfield9986 commented 4 months ago

Really sorry, and thank you for taking the time to reply. The cause was probably a misconfigured library in my environment, which led me to add an extra argument to one line inside the generate_grid_index function, and that swapped my rows and columns. The problem is solved now, thank you very much!!!
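In case someone else hits the same shape mismatch, here is a small illustration of how an extra indexing argument to torch.meshgrid can transpose H and W. This is only a guess at what the modified generate_grid_index ended up doing, not the actual SAGA code:

```python
import torch

H, W = 378, 504  # the downsampled fern resolution from the traceback above

# indexing="ij" yields (H, W) grids, matching a depth map of shape (H, W).
ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
grid_ij = torch.stack([xs, ys], dim=-1)   # shape (378, 504, 2)

# indexing="xy" yields (W, H) grids, i.e. rows and columns swapped.
ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="xy")
grid_xy = torch.stack([xs, ys], dim=-1)   # shape (504, 378, 2)

print(grid_ij.shape, grid_xy.shape)
# Broadcasting grid_xy[:, :, 0] against a (378, 504) depth map raises:
# "The size of tensor a (378) must match the size of tensor b (504)
#  at non-singleton dimension 1"
```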

huziyin911 commented 1 day ago

Hello, may I ask: is the correct order of steps to first run python train_scene.py -s <path>, then run python extract_segment_everything_masks.py --image_root <path> --sam_checkpoint_path <path> --downsample <1/2/4/8> followed by python get_scale.py --image_root <path> --model_path <path to the pre-trained 3DGS model>, and finally python get_clip_features.py --image_root <path>?