
[Question] GPU vs Metal performance & Seeding models #4384

Closed: aramcheck closed this issue 3 months ago

aramcheck commented 7 months ago

Hello,

This is not an issue report, but hopefully it is okay to ask here.

I ran some tests using llama.cpp on Apple Silicon (MacBook Air M1) and on an NVIDIA Quadro M4000. Using the same model, orca-2-7b.Q4_0.gguf, I got much better performance on the M1. Concretely:

llama.cpp on M1 (Metal)

llama_print_timings:        load time =    1035.74 ms
llama_print_timings:      sample time =     166.85 ms /   216 runs   (    0.77 ms per token,  1294.54 tokens per second)
llama_print_timings: prompt eval time =     808.49 ms /    84 tokens (    9.62 ms per token,   103.90 tokens per second)
llama_print_timings:        eval time =   15542.42 ms /   215 runs   (   72.29 ms per token,    13.83 tokens per second)
llama_print_timings:       total time =   16641.54 ms

llama.cpp on the Quadro M4000 (GPU)

llama_print_timings:        load time =     725.88 ms
llama_print_timings:      sample time =     135.49 ms /   244 runs   (    0.56 ms per token,  1800.86 tokens per second)
llama_print_timings: prompt eval time =    2047.53 ms /    84 tokens (   24.38 ms per token,    41.02 tokens per second)
llama_print_timings:        eval time =   52313.11 ms /   243 runs   (  215.28 ms per token,     4.65 tokens per second)
llama_print_timings:       total time =   54576.09 ms
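
For reference, a sketch of how a run like the above might be invoked (assuming the main example and its standard flags; the model path, seed, token count, and prompt file below are placeholders, and -ngl controls how many layers are offloaded to Metal or the GPU):

./main -m orca-2-7b.Q4_0.gguf -ngl 99 -n 256 --seed 42 -f prompt.txt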

The other aspect that caught my attention is that running both models with the same seed yields different results on the two architectures, although the completions look very similar in structure.

I don't have a deep understanding of how the seed is implemented, but I wanted to ask whether both observations are expected based on your experience.
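
One plausible explanation, shown as a toy sketch below (not llama.cpp code): the seed only fixes the sampler's random draws, while the logits computed by different backends can differ by tiny floating-point amounts; if two candidate tokens are nearly tied at some step, the chosen token can flip, and the completions diverge from there while often staying similar in structure.

// Toy sketch (not llama.cpp code): the seed only fixes the sampler's random
// draws. If two backends compute slightly different logits (different
// floating-point ordering in the kernels), a near-tied token choice can flip
// at some step even with the same seed, and the completions diverge from there.
#include <algorithm>
#include <cstdio>
#include <vector>

// pick the highest-scoring token (greedy choice for simplicity)
static int pick_token(const std::vector<float> & logits) {
    return (int) (std::max_element(logits.begin(), logits.end()) - logits.begin());
}

int main() {
    // hypothetical logits for the same step on two backends: tokens 0 and 1
    // are nearly tied, and each backend perturbs them slightly differently
    std::vector<float> metal_logits = {2.31001f, 2.31000f, 0.70f};
    std::vector<float> gpu_logits   = {2.31000f, 2.31002f, 0.70f};

    printf("Metal picks token %d\n", pick_token(metal_logits)); // prints 0
    printf("GPU   picks token %d\n", pick_token(gpu_logits));   // prints 1
    return 0;
}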

This is the prompt I used:

<|im_start|>system
You are Orca, an AI language model created by Microsoft. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior.<|im_end|>
<|im_start|>user
What are the main challenges in higher Education once AGI (Artificial General Intelligence) is achieved?<|im_end|>
<|im_start|>assistant

And both responses:

Metal: There are different possible scenarios for how AGI could affect higher education, depending on how it is developed, implemented, and regulated. Some of the main challenges that could arise are:

GPU: There are different possible scenarios for how AGI might affect higher education, depending on how it is developed, deployed, and regulated. However, some of the main challenges that could be encountered are:

github-actions[bot] commented 4 months ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 3 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.