Zefan-Cai / PyramidKV

MIT License

Congratulations on the great work & hope for future versions #5

Open doptime opened 3 weeks ago

doptime commented 3 weeks ago

The higher the LLM layer, the more attention concentrates on a few key tokens. Therefore, if after a few iterations of the lower-layer operations it becomes possible to detach from them and run only a high-layer loop, the solver can be expressed as a question-specific LLM. This idea aligns closely with the ultimate computing architecture I have envisioned in dreams. With the same computing power, at a higher level of abstraction, cognition could be improved by 3-6 orders of magnitude, equivalent to roughly 40 years of Moore's Law. PyramidKV marks the beginning of vast potential for innovation in this field.
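To make the layer-wise idea concrete, here is a minimal sketch, not PyramidKV's actual implementation: a pyramid-shaped KV cache whose per-layer budget shrinks with depth, keeping only the most-attended tokens. The function names, the linear budget schedule, and the use of accumulated attention mass as the selection score are all illustrative assumptions.

```python
import torch

def pyramid_budgets(num_layers: int, max_budget: int, min_budget: int) -> list[int]:
    """Linearly shrink the KV-cache budget from the bottom layer to the top,
    reflecting the observation that higher layers attend to fewer key tokens.
    (Illustrative schedule only; any monotonically decreasing schedule works.)"""
    if num_layers == 1:
        return [max_budget]
    step = (max_budget - min_budget) / (num_layers - 1)
    return [round(max_budget - i * step) for i in range(num_layers)]

def select_kv(keys: torch.Tensor, values: torch.Tensor,
              attn_scores: torch.Tensor, budget: int):
    """Keep only the `budget` cached tokens with the highest accumulated attention.

    keys, values: (seq_len, head_dim); attn_scores: (seq_len,), a stand-in for
    the attention mass each cached token has received, summed over queries/heads.
    """
    budget = min(budget, keys.shape[0])
    kept = torch.topk(attn_scores, budget).indices.sort().values  # preserve token order
    return keys[kept], values[kept]

# Toy usage: 8 layers, budgets shrink from 64 cached tokens down to 8.
if __name__ == "__main__":
    torch.manual_seed(0)
    seq_len, head_dim = 128, 16
    budgets = pyramid_budgets(num_layers=8, max_budget=64, min_budget=8)
    for layer, budget in enumerate(budgets):
        k = torch.randn(seq_len, head_dim)
        v = torch.randn(seq_len, head_dim)
        scores = torch.rand(seq_len)  # placeholder for observed attention mass
        k_kept, v_kept = select_kv(k, v, scores, budget)
        print(f"layer {layer}: kept {k_kept.shape[0]}/{seq_len} KV pairs")
```

The linear schedule here is only one possible choice; the point is that deeper layers can keep far fewer KV pairs without losing the few tokens on which attention actually concentrates.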

This computing architecture has appeared in my dreams many times, and I feel it is coming true. It involves generating a general, question-specific graph and training a very small network in real time, as in the sketch below.
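As a purely hypothetical illustration of "training a very small network in real time": a fresh per-question adapter could be fitted at inference time. Every name, shape, and loss below is an assumption for illustration, not a worked-out design.

```python
import torch
import torch.nn as nn

class TinyAdapter(nn.Module):
    """A deliberately small network, instantiated fresh for each question."""
    def __init__(self, dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def fit_for_question(features: torch.Tensor, targets: torch.Tensor,
                     steps: int = 50, lr: float = 1e-2) -> TinyAdapter:
    """Fit a tiny network, at inference time, on data derived from one question.
    `features`/`targets` stand in for question-specific representations."""
    model = TinyAdapter(dim=features.shape[-1])
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(features), targets)
        loss.backward()
        opt.step()
    return model

if __name__ == "__main__":
    torch.manual_seed(0)
    feats = torch.randn(16, 32)  # hypothetical question-specific features
    targs = torch.randn(16, 32)
    adapter = fit_for_question(feats, targs)
    n_params = sum(p.numel() for p in adapter.parameters())
    print(f"per-question adapter trained: {n_params} params")
```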

Inspired by human cognitive processes, I believe an overlooked direction is to emphasize problem-solving at a conceptual level rather than repetitive, low-level processing.

doptime commented 3 weeks ago

You are on a snowball track with the thickest snow and the longest slope. What an enviable thing!

doptime commented 1 week ago

On the way forward for LLMs, from an information-theoretic and statistical perspective:

a) Our agency to change the world comes from a kind of posterior insight, one that must be grounded in everything at hand, in conditions both inside and outside the system. For a solver this means, first, that the conditions are massive, and second, that the reasoning is deep. Under these circumstances the solver can never treat the posterior solution as prior knowledge. Deep reasoning practice can mitigate this, but near-zero-inference solving is impossible.

b) The difficulty of optimizing such a posterior solver lies in the sparsity of the conditions, and even more in the enormous number of possible combinations of sparse conditions. Herbert Simon's bounded rationality illustrates this well. Sparsity means cognition must always retain fresh possibilities. As in traditional engineering, optimization is never something that can be planned in advance, yet it is reliably magic after the fact: how far asking "what else can be done" pushes improvement often astonishes us. At the same time, the sparsity of relations means that careful cognition and evaluation are usually feasible.

Conclusion: if we pursue a very powerful artificial super intelligence, then "bigger" is probably not a necessity for large models, but deeper reasoning training is. Deeper reasoning means we must introduce an architecture that is far more efficient, reasons far more deeply, and supports long thinking.