xai-org / grok-1

Grok open release
Apache License 2.0

Grok-1.5V code release #323

Open fabiopoiesi opened 5 months ago

fabiopoiesi commented 5 months ago

Hi, when are you planning to release the source code of Grok-1.5V?

Thanks

pattang56892 commented 4 months ago

Does it matter? The amount of GPU power required to test the code for Grok is enormous. Even if the code is released (just like Grok-1), there is no way you can test it on your local PC. I believe you need a subscription on X to test it.

fabiopoiesi commented 4 months ago

It does matter, because I want to learn how vision and language are made to communicate with each other.

pattang56892 commented 4 months ago

I’ve encountered some challenges while working with Grok. After downloading the weights (approximately 300 GB) and setting everything up in my IDE, my PC froze as soon as I ran the run.py script. Looking into the code, it appears that this LLM requires a platform with at least 8 GPUs (Linux/Unix). Given these requirements, it seems impractical for my current setup.
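
For anyone else trying this locally, here is a minimal pre-flight sketch, assuming the JAX stack the repo uses and taking the 8-accelerator sharding mentioned above as given. It fails fast instead of freezing the machine while the ~300 GB checkpoint loads:

```python
# Hedged sketch: check available accelerators before launching run.py,
# so the process aborts before any weights are read into memory.
import jax

REQUIRED_DEVICES = 8  # assumption: the example script shards the checkpoint 8 ways

devices = jax.devices()
print(f"backend: {jax.default_backend()}, devices: {len(devices)}")
for d in devices:
    print(f"  {d}")

if jax.default_backend() == "cpu" or len(devices) < REQUIRED_DEVICES:
    raise SystemExit(
        f"Fewer than {REQUIRED_DEVICES} accelerators available; "
        "stopping before run.py tries to load ~300GB of weights."
    )
```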

However, I can see how local experimentation can be achieved with Ollama. Ollama's ability to run models on a local machine makes it possible to build a GUI around various Llama models. That provides a user-friendly frontend, lets users interact with different models, and serves as an effective learning platform. This, in my opinion, is fantastic.
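
As a rough sketch of that GUI-to-backend link, assuming a stock Ollama install listening on its default port (the model name below is only an example of a model you have already pulled):

```python
# Hedged sketch: a frontend talking to a local Ollama backend over its REST API.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"

def ask(prompt: str, model: str = "llama3") -> str:
    """Send a prompt to the local Ollama server and return the full reply."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(ask("Explain in one sentence how a vision encoder feeds a language model."))
```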

Given these constraints, however, how would you learn about the communication between the GUI and the backend in Grok when it cannot be run locally due to its high GPU requirements? Can you provide more details?

fabiopoiesi commented 3 months ago

I'm asking about 1.5V because I'm interested in the multimodal model: vision + language
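
Until that code is released, the only thing to go on is the pattern most open multimodal models use: a vision encoder plus a small learned projection into the language model's embedding space. A toy sketch of that generic pattern, explicitly not Grok-1.5V's implementation (which is what this issue is asking for):

```python
# Hedged sketch of the common vision-language "projector" pattern (LLaVA-style),
# NOT Grok-1.5V's actual architecture. All shapes below are illustrative.
import numpy as np

rng = np.random.default_rng(0)

num_patches, vision_dim = 256, 1024   # e.g. a ViT encoder's patch outputs
lm_dim, num_text_tokens = 4096, 16    # language model's hidden size

# 1) Vision encoder output (stand-in for a real ViT forward pass).
patch_embeddings = rng.normal(size=(num_patches, vision_dim))

# 2) Learned projection from vision space into the LM's embedding space.
W_proj = rng.normal(size=(vision_dim, lm_dim)) * 0.02
image_tokens = patch_embeddings @ W_proj          # (256, 4096)

# 3) Text token embeddings from the LM's embedding table.
text_tokens = rng.normal(size=(num_text_tokens, lm_dim))

# 4) The LM simply sees one longer sequence: image tokens, then text tokens.
sequence = np.concatenate([image_tokens, text_tokens], axis=0)
print(sequence.shape)  # (272, 4096) -> fed to the transformer as usual
```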
