modularml / mojo

The Mojo Programming Language
https://docs.modular.com/mojo/manual/
Other
23.27k stars 2.59k forks source link

[BUG]: Mojo is not as fast, as it could be (when printing to stdout is involved) #975

Closed gryznar closed 2 weeks ago

gryznar commented 1 year ago

Bug description

This video (finding prime numbers): https://www.youtube.com/watch?v=D4xX3aS5JjE&pp=ygUTbW9qbyB2cyBnbyB2cyBjb2Rvbg%3D%3D shows that Mojo is fast, but not as fast as other languages like Go or even Codon (compiled Python implementation).

I have performed some benchmarks using similar code as in video:

from time import now

fn is_prime(n: Int) -> Bool:
    if n <= 1:
        return False
    if n <= 3:
        return True
    if n%2 == 0 or n % 3 == 0:
        return False

    var i: Int = 5
    while i * i <= n:
        if n% i == 0 or n% (i+2) ==0:
            return False
        i += 6
    return True

fn find_primes_before_n(n: Int) -> Int:
    var total: Int = 0
    for i in range(2, n):
        if is_prime(i):
            total += 1
    return total

fn main():
    let num = 10_000_000
    let start_time: Int = now()
    let count: Int = find_primes_before_n(num)
    let elapsed_time: Float64 = (now() - start_time) / 1e9
    print('Mojo took: ', elapsed_time, 's')
    print('total primes before', num, ': ', count)
package main

import (
    "fmt"
    "time"
)

func isPrime(n int) bool {
    if n <= 1 {
        return false
    }
    if n <= 3 {
        return true
    }
    if n%2 == 0 || n%3 == 0 {
        return false

    }

    i := 5
    for i*i <= n {
        if n%i == 0 || n%(i+2) == 0 {
            return false
        }
        i += 6
    }

    return true
}

func findPrimesBeforeN(n int) int {
    total := 0
    for i := 2; i < n; i++ {
        if isPrime(i) {
            total += 1
        }
    }
    return total
}

func main() {
    num := 10_000_000
    startTime := time.Now()
    count := findPrimesBeforeN(num)
    endTime := time.Now()
    elapsedTime := endTime.Sub(startTime)
    fmt.Println("Go took: ", elapsedTime.Seconds(), "s")
    fmt.Println("total primes before", num, ": ", count)
}

Results (Go and Mojo compiled AoT):

Mojo took:  3.7724631670000002 s
total primes before 10000000 :  664579
Go took:  8.895780889 s
total primes before 10000000 :  664579

As you can see in pure computations Mojo is taking 2x less time.

With adding prints: to Mojo:

fn find_primes_before_n(n: Int) -> Int:
    var total: Int = 0
    for i in range(2, n):
        if is_prime(i):
            print(i)
            total += 1
    return total

to Go:

func findPrimesBeforeN(n int) int {
    total := 0
    for i := 2; i < n; i++ {
        if isPrime(i) {
            fmt.Println(i)
            total += 1
        }
    }
    return total
}

Results (Go and Mojo compiled AoT):

Mojo took:  26.733618268000001 s
total primes before 10000000 :  664579
Go took:  27.621433634 s
total primes before 10000000 :  664579

Mojo is loosing its performance benefits, when stdout is involved. Tested on:

Ubunt 22.10 (VirtualBox)
go1.21.1
Mojo 0.3.1
ksandvik commented 1 year ago

Any benchmark numbers?

gryznar commented 1 year ago

I have attached my benchmarks result in issue.

abduld commented 1 year ago

Thanks @gryznar for reporting. We are aware of this issue and will take steps to resolve this ASAP

gryznar commented 1 year ago

@abduld another thing to add here (more important than stdout). In some cases Mojo jitted is much slower than python. For example in sth like that:

fn main():
    print("sth1")
    print('sth2')
    ...
    print('sth1000')

This script compiled AoT takes 0.04s to run, similar code takes in Python 0.08s, but via:

mojo run script.mojo

it takes over 2s, where 2s takes to compile that.

abduld commented 1 year ago

Thanks @gryznar this is good to know. I believe the fix for the above will make compilation faster as well. Let's take one thing at a time ;)

gryznar commented 1 year ago

I hope so, cause compilation time here is alarming

Edit: @abduld Yet another thing, but may be very useful. As it was mentioned here: https://discord.com/channels/1087530497313357884/1098713601386233997/1159447998489710643

in this video: https://youtu.be/kgUXfDpAmGQ?si=ddJZ7Po7xRU6lI7i speakers's compiler can compile 3M lines of code/s. Maybe some tricks introduced there may be ported to Mojo? With such speed startup problems may fade into oblivion 😁

gryznar commented 11 months ago

@abduld when you will manage to fix this,  could you also look at that: https://github.com/modularml/mojo/issues/1216 ? This is another String related problem, in this case connected with allocations. Root cause may be similar

ConnorGray commented 6 months ago

FYI to anyone interested in improved formatting performance, while we don't have buffered IO yet, I recently implemented a Formatter abstraction for Mojo that avoids needing intermediate String allocations when trying to format/print a type, see this recent commit:

[stdlib] feature: Add non-allocating formatting abstraction

which includes more information and examples.

uberkael commented 2 months ago

Is there an alternative method for printing? Is it possible to use the Python print function?

jackos commented 2 weeks ago

Resolved by https://github.com/modularml/mojo/commit/9cbfa411d1ea0f01d0b7c1732cc4e36f08c1f969