mauriceqch / pcc_geo_cnn

Learning Convolutional Transforms for Point Cloud Geometry Compression
MIT License

some ideas on improvement #6

Open yansir-X opened 4 years ago

yansir-X commented 4 years ago

Hello Maurice, I'm a newbie to deep learning. I studied your paper and am trying to improve on it as a small project. If possible, I would appreciate hearing your thoughts on some ideas for improving your work.

Your neural network architecture consists of a 3-layer conv, then quantization, then a 3-layer deconv. Do you think the following approaches make sense?

- add a batch normalization layer
- add a dropout layer
- make the model deeper, i.e., add more layers
- perhaps try ResNet

Or do you have some other suggestions?

As I said, I'm a newbie to this field. Please don't hesitate to express your thoughts directly or to criticise.

Thanks in advance! Best Regards

mauriceqch commented 4 years ago

Hello,

I'm not sure about batch normalization and dropout. Making the model deeper and using ResNet-like architectures improve performance significantly.
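To illustrate the ResNet idea mentioned above, here is a minimal framework-free sketch (not code from this repository): a residual block adds the input back to the output of a learned transform, `y = x + F(x)`, so deeper stacks stay easy to train. The `toy_layer` stand-in is a hypothetical placeholder for a learned 3D conv layer.

```python
import numpy as np

def residual_block(x, transform):
    """Residual (ResNet-style) block: output = input + transform(input).

    The identity skip connection lets gradients flow directly through
    deep stacks, which is what makes deeper models easier to train.
    """
    return x + transform(x)

# Toy stand-in for a learned layer: a fixed linear map with tanh.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)) * 0.1

def toy_layer(x):
    return np.tanh(x @ w)

x = rng.standard_normal((2, 4))   # batch of 2 feature vectors
y = residual_block(x, toy_layer)
print(y.shape)  # (2, 4)
```

In a real model, `transform` would be one or more 3D convolution layers; the skip connection is unchanged.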

This list may not be complete but it gives a good idea of what can be done on PCC using deep learning approaches.

  1. D. Tang et al., ‘Deep Implicit Volume Compression’, arXiv:2005.08877 [cs, eess], May 2020, Accessed: May 20, 2020. [Online]. Available: http://arxiv.org/abs/2005.08877.
  2. L. Huang, S. Wang, K. Wong, J. Liu, and R. Urtasun, ‘OctSqueeze: Octree-Structured Entropy Model for LiDAR Compression’, arXiv:2005.07178 [cs, eess], May 2020, Accessed: May 20, 2020. [Online]. Available: http://arxiv.org/abs/2005.07178.
  3. M. Quach, G. Valenzise, and F. Dufaux, ‘Folding-based compression of point cloud attributes’, in arXiv:2002.04439 [cs, eess, stat], Feb. 2020, Accessed: Feb. 14, 2020. [Online]. Available: http://arxiv.org/abs/2002.04439.
  4. A. F. R. Guarda, N. M. M. Rodrigues, and F. Pereira, ‘Point Cloud Coding: Adopting a Deep Learning-based Approach’, in 2019 Picture Coding Symposium (PCS), Nov. 2019, pp. 1–5, doi: 10.1109/PCS48520.2019.8954537.
  5. J. Wang, H. Zhu, Z. Ma, T. Chen, H. Liu, and Q. Shen, ‘Learned Point Cloud Geometry Compression’, arXiv:1909.12037 [cs, eess], Sep. 2019, Accessed: Sep. 30, 2019. [Online]. Available: http://arxiv.org/abs/1909.12037.
  6. W. Yan, Y. Shao, S. Liu, T. H. Li, Z. Li, and G. Li, ‘Deep AutoEncoder-based Lossy Geometry Compression for Point Clouds’, arXiv:1905.03691 [cs, eess], Apr. 2019, Accessed: Sep. 02, 2019. [Online]. Available: http://arxiv.org/abs/1905.03691.

About PCC, this paper gives an excellent overview on the work done by MPEG on G-PCC and V-PCC standards:

  1. S. Schwarz et al., ‘Emerging MPEG Standards for Point Cloud Compression’, IEEE Journal on Emerging and Selected Topics in Circuits and Systems, pp. 1–1, 2018, doi: 10.1109/JETCAS.2018.2885981.

Also, work on deep learning for image compression is very relevant to the field so you might want to check it out.

  1. J. Ballé, D. Minnen, S. Singh, S. J. Hwang, and N. Johnston, ‘Variational image compression with a scale hyperprior’, arXiv:1802.01436 [cs, eess, math], Jan. 2018, Accessed: Jan. 18, 2019. [Online]. Available: http://arxiv.org/abs/1802.01436.
  2. J. Ballé, V. Laparra, and E. P. Simoncelli, ‘End-to-end Optimized Image Compression’, in 2017 International Conference on Learning Representations, 2017, Accessed: Oct. 31, 2018. [Online]. Available: http://arxiv.org/abs/1611.01704.

And when doing compression, it is also necessary to consider quality assessment.

  1. G. Meynet, Y. Nehmé, J. Digne, and G. Lavoué, ‘PCQM: A Full-Reference Quality Metric for Colored 3D Point Clouds’, in 12th International Conference on Quality of Multimedia Experience (QoMEX 2020), Athlone, Ireland, May 2020, Accessed: May 23, 2020. [Online]. Available: https://hal.archives-ouvertes.fr/hal-02529668.
  2. G. Meynet, J. Digne, and G. Lavoué, ‘PC-MSDM: A quality metric for 3D point clouds’, in 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX), Jun. 2019, pp. 1–3, doi: 10.1109/QoMEX.2019.8743313.
  3. E. Alexiou, I. Viola, T. M. Borges, T. A. Fonseca, R. L. de Queiroz, and T. Ebrahimi, ‘A comprehensive study of the rate-distortion performance in MPEG point cloud compression’, APSIPA Transactions on Signal and Information Processing, vol. 8, ed 2019, doi: 10.1017/ATSIP.2019.20.
  4. E. M. Torlig, E. Alexiou, T. A. Fonseca, R. L. de Queiroz, and T. Ebrahimi, ‘A novel methodology for quality assessment of voxelized point clouds’, in Applications of Digital Image Processing XLI, Sep. 2018, vol. 10752, p. 107520I, doi: 10.1117/12.2322741.
  5. D. Tian, H. Ochimizu, C. Feng, R. Cohen, and A. Vetro, ‘Geometric distortion metrics for point cloud compression’, in 2017 IEEE International Conference on Image Processing (ICIP), Beijing, Sep. 2017, pp. 3460–3464, doi: 10.1109/ICIP.2017.8296925.

Best,

yansir-X commented 4 years ago

Again, thanks for your detailed answer!

I have another question regarding decompression.py.

As you mentioned, this script is very slow even on the GPU. I experimented, and it takes roughly 2 minutes to decompress and generate a single .ply.bin.ply file. But when I switch to the CPU, I get the error Conv3DBackpropInputOpV2 only supports NDHWC on the CPU, as mentioned in another thread. So is there a solution to this dilemma? I'm curious how you run it. After all, there are thousands of files to decompress, and neither CPU nor GPU seems to work well.

Thanks again! Best

mauriceqch commented 4 years ago

In decompression.py, there is a performance issue at high resolutions which is documented here tensorflow/tensorflow#25760 .
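Regarding the NDHWC error on the CPU: stock TensorFlow's CPU 3D conv kernels only accept channels-last tensors. A generic workaround (a sketch, not this repository's code) is to transpose channels-first data to channels-last before feeding a CPU model built with `data_format='channels_last'`:

```python
import numpy as np

# NCDHW (channels-first) voxel grid: batch, channels, depth, height, width
x_ncdhw = np.zeros((1, 1, 64, 64, 64), dtype=np.float32)

# CPU 3D conv kernels in stock TensorFlow expect NDHWC (channels-last),
# so move the channel axis to the end before running on the CPU.
x_ndhwc = np.transpose(x_ncdhw, (0, 2, 3, 4, 1))
print(x_ndhwc.shape)  # (1, 64, 64, 64, 1)
```

The alternative, described below, is to compile TensorFlow with MKL support, which adds channels-first CPU kernels.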

On my configuration, I compiled TensorFlow from source with Intel MKL support (https://www.tensorflow.org/install/source#configure_the_build), which enables channels-first convolutions on the CPU. Compiling TensorFlow with MKL support or dividing the point cloud into blocks should help, as the slowness issue only appears at resolutions greater than 512 in my experience.
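The block-partitioning idea can be sketched as follows (an illustration with hypothetical helper names, not code from this repository): split the voxelized cloud into cubic blocks so each block stays below the problematic resolution and can be processed independently.

```python
import numpy as np

def partition_blocks(points, block_size):
    """Split an N x 3 integer point cloud into cubic blocks.

    Returns a dict mapping block origin -> local coordinates, so each
    block can be compressed/decompressed independently at a resolution
    the 3D convolutions handle quickly.
    """
    origins = (points // block_size) * block_size
    blocks = {}
    for origin, point in zip(map(tuple, origins), points):
        blocks.setdefault(origin, []).append(point - origin)
    return {k: np.array(v) for k, v in blocks.items()}

# Toy cloud at resolution 1024, split into blocks of size 512.
rng = np.random.default_rng(0)
pts = rng.integers(0, 1024, size=(1000, 3))
blocks = partition_blocks(pts, 512)
print(len(blocks))  # up to 8 blocks for a 1024^3 cube
```

After decoding, each block's local coordinates are shifted back by the block origin to reassemble the full cloud.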