Hi,
For params, you can count them layer by layer: (Cin, Cout, K, K, K) for a 3D convolution and (Cin, Cout, K, K) for a 2D convolution. It is not a hard job.
A simpler method is to directly check the file size of the model state dict (remember to remove unused keys in the checkpoint and keep only 'model_state'). The number of params is roughly the file size in bytes divided by 4, because the weights are usually stored as float32: 1M params * 32 bit = 32 Mbit = 4 MB, since 1 byte = 8 bit.
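For illustration, a minimal sketch of this file-size trick (the paths are placeholders, and it assumes float32 weights stored under a 'model_state' key):

```python
import os

import torch

# Keep only 'model_state' from the checkpoint, then estimate the parameter count
# from the file size (float32 -> 4 bytes per parameter; the result is approximate
# because the saved file also contains key names and container overhead).
ckpt = torch.load('/path/to/checkpoint.pth', map_location='cpu')
torch.save({'model_state': ckpt['model_state']}, '/tmp/model_state_only.pth')

size_bytes = os.path.getsize('/tmp/model_state_only.pth')
print(f"~{size_bytes / 4 / 1e6:.2f}M parameters")
```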
For runtime, you can use a timer to measure the forward time of the network. For example, place one time.time() at the beginning of the forward function and another at the end (just before the return), then average over the whole validation set.
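As a rough external-timing sketch (equivalent to placing timers inside forward): the `model`, `val_loader`, and `load_data_to_gpu` usage below are assumptions based on the usual OpenPCDet evaluation flow, not code from this thread.

```python
import time

import torch
from pcdet.models import load_data_to_gpu

# Average forward time over the validation set; torch.cuda.synchronize() ensures
# the GPU has finished before each timestamp is taken.
model.cuda()
model.eval()
times = []
with torch.no_grad():
    for batch_dict in val_loader:
        load_data_to_gpu(batch_dict)
        torch.cuda.synchronize()
        start = time.time()
        model(batch_dict)
        torch.cuda.synchronize()
        times.append(time.time() - start)
print(f"average forward time: {1000 * sum(times) / len(times):.1f} ms")
```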
I close this issue for now. Feel free to reopen if there are any other issues.
How can I see the remaining time for training?
I reproduced the results of the paper by training PV-RCNN and Voxel R-CNN based multimodal models on the KITTI dataset. Meanwhile, I checked the number of parameters and found it was much larger than reported in the paper. After loading the provided checkpoints, I found that they do not include the layers from "backbone_3d.semseg.ifn.model". I do not understand why my models have these layers while the provided checkpoints do not. See how this affects the sizes:
| MODEL | Whole | backbone_3d.semseg.ifn.model | Focals Conv |
|---|---|---|---|
| Voxel R-CNN multimodal multiclasses | 47.57M | 39.7M | 7.87M |
| PV-RCNN multimodal multiclasses | 53.09M | 39.7M | 13.39M |
What are these layers and why are they not included in the provided checkpoints?
```python
import torch
from pcdet.config import cfg, cfg_from_yaml_file
from pcdet.datasets import build_dataloader
from pcdet.models import build_network
from pcdet.utils import common_utils

if __name__ == '__main__':
    # Load model config (voxel_rcnn_multiclasses_focalsconv.yaml for Voxel R-CNN)
    cfg_from_yaml_file('cfgs/kitti_models/pv_rcnn_focal_multimodal.yaml', cfg)

    # Init logger
    log_file = cfg.ROOT_DIR / 'output' / 'log_modelsize.txt'
    logger = common_utils.create_logger(log_file, rank=cfg.LOCAL_RANK)

    # Init dataset
    test_set, test_loader, test_sampler = build_dataloader(
        dataset_cfg=cfg.DATA_CONFIG,
        class_names=cfg.CLASS_NAMES,
        batch_size=1,
        dist=False, workers=4,
        logger=logger,
        training=False,
        merge_all_iters_to_one_epoch=False,
        total_epochs=1
    )

    # Build model
    model = build_network(model_cfg=cfg.MODEL,
                          num_class=len(cfg.CLASS_NAMES),
                          dataset=test_set)

    # Init variables to store sizes
    focalsconv_size = 0
    ifn_size = 0

    # Browse model layers
    for name, param in model.state_dict().items():
        num_params = torch.prod(torch.tensor(param.shape)).item()
        if name.startswith("backbone_3d.semseg.ifn.model"):
            ifn_size += num_params
        else:
            focalsconv_size += num_params
    model_size = focalsconv_size + ifn_size

    # Show sizes (in millions of parameters)
    print(f"Whole Size: {round(model_size / 1e6, 2)}M parameters")
    print(f"...ifn.model Size: {round(ifn_size / 1e6, 2)}M parameters")
    print(f"Focals Conv Size: {round(focalsconv_size / 1e6, 2)}M parameters")
```
I ran the script from the "tools" directory; you can run it elsewhere, but you will need to update the path to the configuration file (.yaml).
Hi @ThomasPDM,
After double checking, I realized what the problem is.
`backbone_3d.semseg` is actually initialized from a 2D image segmentation model, DeepLabv3 with a ResNet50 backbone.
You can see it here: https://github.com/dvlab-research/FocalsConv/blob/2202be495b8a00af4a9a46e087e415f3f9533823/OpenPCDet/pcdet/models/backbones_3d/SemanticSeg/sem_deeplabv3.py#L158
However, what we actually use is only the first several layers of this model, up to "layer1". The other layers in deeplabv3_resnet50 are not used in our model.
You can see this argument here: https://github.com/dvlab-research/FocalsConv/blob/2202be495b8a00af4a9a46e087e415f3f9533823/OpenPCDet/pcdet/models/backbones_3d/spconv_backbone_focal.py#L132
Thus, I deleted the other layers in deeplabv3_resnet50. That is why my pre-trained weights are smaller than yours.
Regards, Yukang Chen
Thanks for the quick answer. I do not understand how to remove these parameters from the whole model. I did not change anything except for the script that I added to build the model and display the number of parameters. To be sure, I retried from scratch on Ubuntu 20.04 with Conda:
```bash
# Get the repository
git clone https://github.com/dvlab-research/FocalsConv.git
cd FocalsConv/OpenPCDet

# Set up the conda environment
conda create -n focalsconv python==3.8
conda activate focalsconv
conda install pytorch==1.10.1 torchvision==0.11.2 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install -r requirements.txt

# Add the env libraries to LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/path/to/conda/envs/focalsconv/lib:$LD_LIBRARY_PATH

# Install OpenPCDet
python setup.py develop --user

# Set access to the KITTI dataset from FocalsConv/OpenPCDet/data
ln -s /path/to/KITTI/data/training data/kitti/training
ln -s /path/to/KITTI/data/testing data/kitti/testing
ln -s /path/to/KITTI/data/ImageSets data/kitti/ImageSets

# Generate preprocessed data for OpenPCDet
python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml

# Create the output directory
mkdir output

# Copy and run the script (provided above)
cd tools
cp /path/to/the/size_script.py size_script.py
python size_script.py
```
The results are the same as above. Also, when I load the provided checkpoints by adding this after building the model:
```python
# Load checkpoints (for PV-RCNN here, but same kind of results with Voxel R-CNN)
ckpt = "/path/to/provided/checkpoints/pvrcnn_focal_multimodal.pth"
model.load_params_from_file(filename=ckpt, logger=logger, to_cpu=False)
model.cuda()
```
Then the weights for "backbone_3d.semseg.ifn.model" layers are not updated:
```
INFO ==> Checkpoint trained from version: pcdet+0.3.0+0000000
INFO Not updated weight backbone_3d.semseg.ifn.model.backbone.layer2.0.conv1.weight: torch.Size([128, 256, 1, 1])
...
INFO Not updated weight backbone_3d.semseg.ifn.model.classifier.4.bias: torch.Size([21])
INFO ==> Done (loaded 467/763)
```
How can I build the model without these layers? How did you manage to delete the other layers in deeplabv3_resnet50? Do you iterate over the children of the model to create a copy without these layers?
Regards
Hi,
For me, I changed the `_segm_resnet` function in path_to/site-packages/torchvision/models/segmentation/segmentation.py as below.
There are two modifications:
(1) `return_layers = {'layer1': 'out'}  # {'layer4': 'out'}`
(2) `classifier = None  # model_map[name][0](inplanes, num_classes)`
I know this might not be the best way. You can also rewrite it in other ways.
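For context, a simplified paraphrase of where these two edits land (this is not the verbatim torchvision source, which differs between versions, and the function name here is only illustrative):

```python
# Simplified paraphrase of the relevant part of torchvision's _segm_resnet,
# showing the effect of the two modifications above.
from torchvision.models import resnet
from torchvision.models._utils import IntermediateLayerGetter


def build_truncated_backbone(backbone_name='resnet50', pretrained_backbone=True):
    backbone = resnet.__dict__[backbone_name](
        pretrained=pretrained_backbone,
        replace_stride_with_dilation=[False, True, True])
    # (1) stop feature extraction after layer1 instead of layer4
    return_layers = {'layer1': 'out'}   # {'layer4': 'out'}
    backbone = IntermediateLayerGetter(backbone, return_layers=return_layers)
    # (2) the segmentation head is never used here, so it is not built
    classifier = None                   # model_map[name][0](inplanes, num_classes)
    return backbone, classifier
```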
Regards, Yukang Chen
Hi,
Instead of updating the library locally, I suggest adding a few lines of code in OpenPCDet/pcdet/models/backbones_3d/SemanticSeg/sem_deeplabv3.py:
```python
class SemDeepLabV3(SegTemplate):
    def __init__(self, backbone_name, **kwargs):
        ...
        super().__init__(constructor=constructor, **kwargs)
        # We actually use only the first several layers in this model
        if backbone_name == "ResNet50":
            backbone_blocks = ["conv1", "bn1", "relu", "maxpool", "layer1"]
            new_model = next(self.model.children())  # keep the backbone
            self.model = nn.Sequential()             # reset the model
            for name, module in new_model.named_children():
                if name in backbone_blocks:          # keep useful blocks
                    self.model.add_module(name=name, module=module)
```
Be careful: I may have added or omitted a relevant block, as this does not give exactly the same parameters as your suggested modification. Anyway, you may find this attempt useful as inspiration. Also, in my opinion, you should update your code so that everyone can build the correctly sized model (or at least explain what you did in the README.md).
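To pin down the remaining difference, a hypothetical check one could run (the checkpoint path and the 'model_state' key are reused from earlier in this thread, and `model` comes from the size script above):

```python
import torch

# Compare the keys of the freshly built model with the keys of the provided
# checkpoint to see which blocks differ between the two.
ckpt = torch.load("/path/to/provided/checkpoints/pvrcnn_focal_multimodal.pth",
                  map_location="cpu")
model_keys = set(model.state_dict().keys())
ckpt_keys = set(ckpt["model_state"].keys())
print("in model but not in checkpoint:", sorted(model_keys - ckpt_keys))
print("in checkpoint but not in model:", sorted(ckpt_keys - model_keys))
```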
Here is an update to the table that I provided above:
| MODEL | Whole | backbone_3d.semseg.ifn.model | Focals Conv |
|---|---|---|---|
| Voxel R-CNN multimodal multiclasses | 8.10M | 0.23M | 7.87M |
| PV-RCNN multimodal multiclasses | 13.62M | 0.23M | 13.39M |
Thanks for the answers,
Regards
Thanks for the reminder. I will update this information in the README.md.
Dear author, first of all, thanks for your great work. After reading your paper, I would really like to know how to calculate the params and runtime of adding Focals Conv to Voxel R-CNN, as mentioned in your Experiments section. I want to try it, but I do not know how to do it; there is little information on the Internet, and searching Google only left me confused. So I would like to ask for your help if you have code to accomplish this. I would really appreciate it if you could help me! Thank you in advance.