3dv-casia / BWformer

BWformer: 3D Building Wireframe Reconstruction from 2D Height Map with Transformer

Clarifications on model #1

Closed: peterrickwood closed this issue 4 months ago

peterrickwood commented 4 months ago

Thank you for developing this model. After reading the description I have a few questions, and I would be grateful if you could clarify how the corner queries work.

You say that they have dimension (MN, 3). What are M and N? How are the detected 2D reference points used to initialize the X and Y components of the queries, and how are the Z components initialized? An example would be great: suppose you predict 4 2D reference points, so you have a (4, 2) tensor of x, y values [[0, 0], [0, 10.0], [10.0, 10.0], [10.0, 0]]. What are the corner queries that would result from this?

Also, you do not mention what losses you use. It seems clear from your talk that you do something like cross-entropy loss for the 2D vertex prediction, but I'm not clear on the later losses (for the predicted 3D corners, and the final edges). Can you provide any details? Thanks!

PaulLiuYZ commented 4 months ago

I'm sorry that the abstract on GitHub is so brief. A more detailed technical report has been submitted to the workshop, but I don't know when it will be released. M is the number of 2D corners and N is the maximum number of corners sharing the same XY coordinates. The XY coordinates are taken directly from the detected 2D corners, and the Z components are randomly initialized. For your example, the corner queries will be something like [[0, 0, $Z_{11}$], ..., [0, 0, $Z_{1N}$], [0, 10.0, $Z_{21}$], ..., [0, 10.0, $Z_{2N}$], [10.0, 10.0, $Z_{31}$], ..., [10.0, 10.0, $Z_{3N}$], [10.0, 0, $Z_{41}$], ..., [10.0, 0, $Z_{4N}$]], where $Z_{mn}$ is the Z value of the nth corner at the mth 2D position. As for the losses, the 3D corner loss is an L1 loss and the edge loss is a cross-entropy loss. I hope this answers your questions. I will release an arXiv paper with all the details soon and update the GitHub page as well, so you can star our code and stay tuned!
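
For readers who want to see the query layout concretely, here is a minimal sketch of how the (MN, 3) corner queries described in this reply could be assembled. It is not taken from the BWformer code: the function name `build_corner_queries`, the choice N = 3, the uniform distribution, and the Z range are all assumptions made for illustration, since the reply only says that Z is "randomly initialized". Plain Python lists stand in for tensors.

```python
import random


def build_corner_queries(ref_points_2d, n_per_point, z_range=(0.0, 1.0), seed=None):
    """Repeat each (x, y) reference point n_per_point times with a random z.

    Hypothetical helper: the uniform distribution and z_range are assumptions,
    not details confirmed by the BWformer authors.
    """
    rng = random.Random(seed)
    z_lo, z_hi = z_range
    queries = []
    for x, y in ref_points_2d:        # index m = 1..M over detected 2D corners
        for _ in range(n_per_point):  # index n = 1..N over candidate heights
            queries.append([x, y, rng.uniform(z_lo, z_hi)])
    return queries                    # M * N rows, each [x, y, z]


# The four reference points from the question (M = 4), with an assumed N = 3.
ref_points = [[0.0, 0.0], [0.0, 10.0], [10.0, 10.0], [10.0, 0.0]]
queries = build_corner_queries(ref_points, n_per_point=3, seed=0)

for row in queries:
    print(row)
```

With M = 4 and N = 3 this yields 12 rows. Rows are grouped by 2D position, so the first N rows share (0, 0), the next N share (0, 10.0), and so on, matching the ordering $Z_{11}, ..., Z_{1N}, Z_{21}, ...$ in the reply.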