Following issue 17

HuGhcat-code commented 3 months ago

Dear author: Unfortunately, the advice you gave me doesn't help. After clicking the "Semantic" or "Relevancy" button at the top of the settings box, I get nothing but a black background. The RGB and Depth buttons, however, do produce results. Why do "Semantic" and "Relevancy" show only a completely black background, and how can I fix this? Thanks a lot!

sharinka0715 commented 3 months ago

It seems you did not enter the right prompt. Enter the class you want to segment in the "Text prompt" box at the top right, then click the "Apply text prompt" button.

If you still cannot get proper results, I suggest trying OpenSeg or LSeg on larger scenes rather than MVImgNet first, to get familiar with the Viser visualization interface.

HuGhcat-code commented 3 months ago

I noticed that some code in the fusion.py file is commented out; that should be the reason. But I am still trying to resolve this issue.

HuGhcat-code commented 3 months ago

Could you please tell me how to reproduce the Part Segmentation Result in the paper?

sharinka0715 commented 3 months ago

> I noticed that some code in the fusion.py file is commented out; that should be the reason. But I am still trying to resolve this issue.

I don't think any of the commented-out code affects the projection result. That code is for debugging only: it saves semantic maps during fusion.

sharinka0715 commented 3 months ago

> Could you please tell me how to realize the Part Segmentation Result in the paper?

I don't know why you cannot reproduce the result in our paper. Please make sure that:

1. Your 2D pretrained model is VLPart, and your target part is in predefined_classes.
2. You have run fusion.py to produce the 2D projection checkpoint 0.pt, and view_viser.py loads it through the render.fusion_dir setting in config.yaml.
3. You enter your target class and part (e.g. mug:handle) in the Text Prompt box, click Apply Text Prompt, and then click Relevancy or Semantic to view the semantic maps.
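
The second check above can be partly automated. The following is a minimal sketch, not code from the repository: `find_fusion_checkpoint` is a hypothetical helper, and it assumes that the directory given in render.fusion_dir directly contains the checkpoint file 0.pt, which the thread implies but does not state.

```python
from pathlib import Path


def find_fusion_checkpoint(fusion_dir: str, name: str = "0.pt") -> Path:
    """Return the path of the fusion checkpoint, or raise if it is missing.

    Hypothetical helper: the layout (checkpoint directly inside fusion_dir)
    is an assumption, not taken from the repository.
    """
    directory = Path(fusion_dir)
    if not directory.is_dir():
        raise FileNotFoundError(f"fusion directory does not exist: {directory}")
    checkpoint = directory / name
    if not checkpoint.is_file():
        raise FileNotFoundError(f"no {name} in {directory}; run fusion.py first")
    return checkpoint
```

Calling it with the same value you put in render.fusion_dir before launching view_viser.py turns a silent black render into an explicit error when the checkpoint is missing.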

I have uploaded a part segmentation sample that includes a COLMAP scene, an RGB 3DGS, and a fusion checkpoint. You can try it with the text prompt guitar:fingerboard,guitar:bridge,guitar:hole,guitar:body,guitar:headstock.
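
The prompt format used here is a comma-separated list of object:part pairs. As a rough illustration of that format only, the sketch below splits such a string into pairs. `parse_part_prompt` is a hypothetical name, and how the viewer actually parses the prompt is not shown in this thread.

```python
from typing import List, Tuple


def parse_part_prompt(prompt: str) -> List[Tuple[str, str]]:
    """Split 'obj:part,obj:part' into (object, part) pairs.

    Illustrative only; not the viewer's own parsing code.
    """
    pairs = []
    for item in prompt.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas
        obj, sep, part = item.partition(":")
        if not sep or not obj.strip() or not part.strip():
            raise ValueError(f"expected 'object:part', got {item!r}")
        pairs.append((obj.strip(), part.strip()))
    return pairs
```

A check like this catches typos such as a missing colon before the prompt is applied.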

The download link is at https://drive.google.com/file/d/1EdmktauTcj5OgMqxoE81idKyqgaHESju/view?usp=drive_link

The result is as follows:

Please carefully check your code and process.

HuGhcat-code commented 3 months ago

I found the reason. At the end of the fusion.py script there is a call to model_2d.set_predefined_cls(SCANNET20_CLASS_LABELS). It should be commented out (# model_2d.set_predefined_cls(SCANNET20_CLASS_LABELS)) when we are not using ScanNet.
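
Instead of commenting the line in and out by hand, the call could be guarded by the dataset in use. This is only a sketch under stated assumptions: `StubModel2D`, `configure_classes`, and `dataset_name` are hypothetical, the label tuple is a truncated placeholder, and only the call model_2d.set_predefined_cls(SCANNET20_CLASS_LABELS) comes from the thread.

```python
# Truncated placeholder; the real label list lives in the repository.
SCANNET20_CLASS_LABELS = ("wall", "floor", "cabinet")


class StubModel2D:
    """Stand-in for the 2D model; it only mimics set_predefined_cls."""

    def __init__(self):
        self.predefined_cls = None

    def set_predefined_cls(self, labels):
        self.predefined_cls = tuple(labels)


def configure_classes(model_2d, dataset_name):
    """Apply the ScanNet label set only when the dataset is ScanNet."""
    if dataset_name.lower() == "scannet":
        model_2d.set_predefined_cls(SCANNET20_CLASS_LABELS)
    return model_2d
```

With a guard like this, a scene from another dataset keeps whatever classes were configured earlier rather than being overwritten by the ScanNet labels.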

HuGhcat-code commented 3 months ago

Dear author: the download you provided has a 'scene' folder that contains several folders. Which of them do you use as the scene to train 3DGS? I used only the 'images' and 'sparse' folders and ignored the other two.

After training, I ran fusion.py and view_viser.py with the same prompt (guitar:fingerboard,guitar:bridge,guitar:hole,guitar:body,guitar:headstock) and got a result far worse than the one you provided. The first image is what I get after retraining 3DGS; the second is from simply running view_viser.py after configuring it with the fusion checkpoint and 3DGS model you provided.

Why is there such a big difference between the two images, and how can I improve the results of my own training? Thanks a lot.

HuGhcat-code commented 3 months ago

Furthermore, I'd like to know how you came up with the text prompt guitar:fingerboard,guitar:bridge,guitar:hole,guitar:body,guitar:headstock, given that it does not use all of the guitar parts listed in 'LVIS_PACO_VOCAB'.

sharinka0715 commented 3 months ago

Hi,

1. Training RGB Gaussians only needs the images and sparse folders.
2. If you get worse results, you can try saving the intermediate results predicted by VLPart (a possible code example). Semantic Gaussians can only project embeddings from pretrained models like VLPart, so if the foundation model cannot get the right result, we cannot get it either.
3. Not all parts are used, which is also due to the performance of VLPart: it cannot identify every part accurately. Besides, some parts overlap (key and headstock, side and body, string and fingerboard), and some parts do not exist in the scene (pickguard, back).
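
The code example linked in the comment above is not preserved here. As a stand-in, and not the author's code, the sketch below dumps a per-pixel class-index map to an image file so that 2D predictions can be inspected before fusion. It assumes the predictions have already been converted to a nested list of integer class indices (for a PyTorch tensor, `tensor.tolist()` would do that); `label_color` and `save_label_map_ppm` are hypothetical names, and the PPM format is used only to keep the example free of third-party dependencies.

```python
import colorsys


def label_color(label: int) -> tuple:
    """Map a class index to a distinct RGB color; index 0 stays black."""
    if label == 0:
        return (0, 0, 0)
    hue = (label * 0.618033988749895) % 1.0  # golden-ratio hue stepping
    r, g, b = colorsys.hsv_to_rgb(hue, 0.8, 1.0)
    return (int(r * 255), int(g * 255), int(b * 255))


def save_label_map_ppm(label_map, path):
    """Write a 2D grid of integer class indices as a binary PPM image."""
    height, width = len(label_map), len(label_map[0])
    pixels = bytearray()
    for row in label_map:
        if len(row) != width:
            raise ValueError("all rows must have the same length")
        for label in row:
            pixels.extend(label_color(label))
    with open(path, "wb") as f:
        f.write(f"P6\n{width} {height}\n255\n".encode("ascii"))
        f.write(pixels)
```

If the saved 2D maps are already wrong for a given view, the problem lies in the 2D model's predictions rather than in the projection step.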

HuGhcat-code commented 3 months ago

Thanks a lot! However, I noticed that for the same guitar scene my 3DGS rendering results are much worse than yours. Why does my rendering look so blurry, and which parts of the code do I need to adjust to improve the results? I am currently training on a single NVIDIA RTX 3090 (cuda:0). Thanks a lot.

HuGhcat-code commented 3 months ago

I'd like to add that the configuration I use in the web page is probably the same as yours. I am not entirely sure which resolution scale you use; mine is the default.

sharinka0715 commented 3 months ago

The blurriness is related to the resolution scale. The default resolution (approximately 480x640) is too low for fullscreen rendering. You can change the resolution scale to 4x. I rendered it on an NVIDIA RTX 4090.
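
To make the arithmetic concrete, here is a small sketch. It assumes the resolution scale multiplies each image dimension, which the comment suggests but does not spell out; `scaled_resolution` is a hypothetical helper.

```python
def scaled_resolution(base, scale):
    """Return the scaled size and the pixel-count multiplier.

    Assumes the scale applies to each dimension independently.
    """
    first, second = base
    return (first * scale, second * scale), scale * scale
```

Under that assumption, 480x640 at 4x becomes 1920x2560, and each frame carries 16 times as many pixels, which is worth keeping in mind on slower GPUs or network links.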

HuGhcat-code commented 3 months ago

Hi: After clicking the '4x' button, the viewer seems to lose its connection. It keeps trying to connect but never succeeds, and I don't know why, so I am having trouble getting the high-resolution rendering result.

HuGhcat-code commented 3 months ago

Do you think it is due to the graphics card model? I connect to a remote server (with an NVIDIA RTX 3090) via SSH, but my local Windows machine does not have a GPU. I'm not sure if that's the reason.

sharinka0715 commented 3 months ago

I don't think it is related to your client machine. The problem might be on your server.

Have you checked the program running on your server? A lost connection often results from an exception on the server.

If the server runs without exceptions but you still cannot connect to it, I have no idea what the cause is.
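
One way to separate a crashed server process from a network problem is to test whether the viewer's port still accepts TCP connections. This is a generic sketch using only the Python standard library; `is_port_open` is a hypothetical helper, and the host and port in the usage note are assumptions to be replaced with the address your viewer actually listens on.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `is_port_open("localhost", 8080)` run on the server itself (8080 being an assumed port) shows whether the process is still listening. If it returns True there but the browser on the client still cannot connect, the SSH port forwarding or a firewall is the more likely culprit.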

sharinka0715 / semantic-gaussians

Following issue 17 #18