Twitte0000 opened this issue 7 months ago
Hi, although parallel connectivity has been studied in earlier work, it has not been explored for Mamba, and our work bridges that gap. Simply connecting Mamba blocks in parallel does not degrade performance; instead it brings improvements in parameters, computational complexity, performance, and more. Moreover, parallelizing Mamba is only a starting point and an inspiration that opens another line of research: any future Mamba variant can also be connected in parallel. Stay tuned for our updated arXiv release in the next few days, where we explore parallelization of several Mamba variants with encouraging results! Just as the classic ResNet gained widespread attention using only simple residual connections, the proposed parallel Mamba is less a specific structure than a way of thinking, and viewed that way I believe our research will be inspiring to you. I hope this answer helps.
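
For readers unfamiliar with the idea, below is a minimal sketch of what "connecting Mamba in parallel" can look like: the channel dimension is split into groups, each group is processed by its own smaller sequence block, and the branch outputs are concatenated. This is an illustration only, not the repository's actual PVM Layer; the class name `ParallelSSMLayer`, the branch count, and the GRU stand-in (used so the sketch runs on CPU without the `mamba_ssm` CUDA kernels) are all assumptions.

```python
# Hypothetical sketch of a parallel split-and-concat layer.
# A real Mamba block (e.g. Mamba(d_model=chunk) from mamba_ssm) would
# normally replace the GRU stand-in used here for portability.
import torch
import torch.nn as nn


class ParallelSSMLayer(nn.Module):
    def __init__(self, channels: int, num_branches: int = 4):
        super().__init__()
        assert channels % num_branches == 0, "channels must split evenly"
        self.chunk = channels // num_branches
        # Each branch only sees channels / num_branches features, so its
        # projection/state matrices are correspondingly smaller -- this is
        # the intuition behind the claimed parameter and FLOP savings.
        self.branches = nn.ModuleList(
            [nn.GRU(self.chunk, self.chunk, batch_first=True)  # stand-in block
             for _ in range(num_branches)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, channels)
        parts = torch.split(x, self.chunk, dim=-1)
        outs = [branch(p)[0] for branch, p in zip(self.branches, parts)]
        return torch.cat(outs, dim=-1)


if __name__ == "__main__":
    layer = ParallelSSMLayer(channels=64, num_branches=4)
    y = layer(torch.randn(2, 128, 64))
    print(y.shape)  # torch.Size([2, 128, 64])
```

Because each branch operates on a reduced channel width, the combined parameter count of the branches can be smaller than that of one full-width block, which is presumably where the savings described above come from.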
Moreover, you did not report results from repeated training runs. When I tested your model myself, the results averaged over three runs were very poor, which makes me seriously doubt your reported results and the exaggerated title of the paper.
Parallel processing and feature-splitting methods are well established in deep learning, particularly in architectures such as the Inception series. If the primary innovation of the PVM Layer is merely an adjustment to the structure or parameters of the parallel blocks, that raises concerns about the novelty and substance of the paper's contribution.
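
As a point of reference for the precedent mentioned above, here is a minimal Inception-style block in which parallel branches process the same input and are fused by channel concatenation. The branch widths and kernel sizes are arbitrary and are not taken from any of the papers under discussion.

```python
# Minimal illustration of Inception-style parallel branches: several
# convolutions run on the same input in parallel and their outputs are
# concatenated along the channel axis, as in GoogLeNet's Inception modules.
import torch
import torch.nn as nn


class TinyInceptionBlock(nn.Module):
    def __init__(self, in_ch: int, branch_ch: int = 16):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, branch_ch, kernel_size=1)
        self.b3 = nn.Conv2d(in_ch, branch_ch, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(in_ch, branch_ch, kernel_size=5, padding=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Spatial size is preserved in every branch, so the outputs can be
        # concatenated channel-wise.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)


if __name__ == "__main__":
    block = TinyInceptionBlock(in_ch=32)
    print(block(torch.randn(1, 32, 56, 56)).shape)  # torch.Size([1, 48, 56, 56])
```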