safe task in channel

ddkwork commented 1 year ago

Is the model safe to use with channels? When I run 500 channels, each performing a text2text task, every one of them gets stuck and memory usage skyrockets, fluctuating between 4 GB and 12 GB.
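
For reference, one common way to keep this kind of fan-out from growing without bound is a fixed-size worker pool. The sketch below is only an illustration: `translateFunc` and `translateAll` are hypothetical names, the callback stands in for whatever call performs the text2text task, and the thread does not confirm whether the model is safe for concurrent use or whether limiting concurrency removes the memory growth.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// translateFunc is a hypothetical stand-in for the real text2text call.
type translateFunc func(text string) (string, error)

// translateAll runs fn over inputs with at most `workers` goroutines
// and returns the results in input order.
func translateAll(inputs []string, workers int, fn translateFunc) ([]string, error) {
	if workers < 1 {
		workers = 1
	}
	results := make([]string, len(inputs))
	errs := make([]error, len(inputs))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is handed to exactly one worker, so the
			// writes to results[i] and errs[i] never overlap.
			for i := range jobs {
				results[i], errs[i] = fn(inputs[i])
			}
		}()
	}
	for i := range inputs {
		jobs <- i
	}
	close(jobs)
	wg.Wait()

	// Report the first error in input order, if any.
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}

func main() {
	// Assumption: a trivial callback replaces the model call here.
	out, err := translateAll([]string{"a", "b", "c"}, 2, func(s string) (string, error) {
		return strings.ToUpper(s), nil
	})
	fmt.Println(out, err)
}
```

With a pool like this, 500 inputs are still processed, but only `workers` calls are in flight at any time.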

matteo-grella commented 1 year ago

Hi, I'm going to release a major update that moves Cybertron to the latest main branch of Spago, which solves a lot of open issues on memory handling. Expect it by the end of next week. That said, what model are you using? Can you post a sample of your attempt using channels here?

ddkwork commented 1 year ago

Hi, I tested this demo:

https://github.com/nlpodyssey/cybertron/blob/main/examples/textgeneration/main.go

Model is "Helsinki-NLP/opus-mt-en-zh"

Let me try the ants library.

chineseComment := translate.Translate(`// Use the pool with a method,
// set 10 to the capacity of goroutine pool and 1 second for expired duration.`)
//chineseComment := translate.Translate(`Use the pool with a method`)

Multi-line text cannot be translated; it results in a memory leak.
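
A possible workaround for the multi-line case is to feed the text one line at a time. This is a minimal sketch under assumptions: `translateLines` is a hypothetical helper, the callback stands in for the real translation call, and nothing in the thread confirms that per-line input avoids the leak.

```go
package main

import (
	"fmt"
	"strings"
)

// translateLines splits text on newlines, skips blank lines, passes each
// remaining line to fn, and joins the outputs with newlines.
// fn is a hypothetical stand-in for the real translation call.
func translateLines(text string, fn func(string) (string, error)) (string, error) {
	var out []string
	for _, line := range strings.Split(text, "\n") {
		line = strings.TrimSpace(line) // also drops a trailing "\r"
		if line == "" {
			continue
		}
		t, err := fn(line)
		if err != nil {
			return "", fmt.Errorf("translating %q: %w", line, err)
		}
		out = append(out, t)
	}
	return strings.Join(out, "\n"), nil
}

func main() {
	text := "// Use the pool with a method,\n// set 10 to the capacity of goroutine pool and 1 second for expired duration."
	// Assumption: an upper-casing callback replaces the model call here.
	res, err := translateLines(text, func(s string) (string, error) {
		return strings.ToUpper(s), nil
	})
	fmt.Println(res, err)
}
```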

matteo-grella commented 9 months ago

Hi @ddkwork, would you mind giving the latest version (v0.2.0) a try?

ddkwork commented 9 months ago

Failed to load the model with 0.2 (screenshot attached).

ddkwork commented 9 months ago

package main

import (
"context"
"fmt"
"github.com/ddkwork/golibrary/mylog"
"os"
"runtime"
"time"

//lint:ignore ST1001 allow dot import just to make the example more readable
. "github.com/nlpodyssey/cybertron/examples"
"github.com/nlpodyssey/cybertron/pkg/tasks"
"github.com/nlpodyssey/cybertron/pkg/tasks/textgeneration"
"github.com/rs/zerolog"
"github.com/rs/zerolog/log"
"github.com/shirou/gopsutil/v3/cpu"
"github.com/shirou/gopsutil/v3/mem"
"github.com/shirou/gopsutil/v3/process"

)

func main() {
zerolog.SetGlobalLevel(zerolog.DebugLevel)
LoadDotenv()

if !mylog.Error(os.Setenv("CYBERTRON_MODELS_DIR", "D:\\workspace\\lfs\\models")) {
    return
}
if !mylog.Error(os.Setenv("CYBERTRON_MODEL", textgeneration.DefaultModelForMachineTranslation("en", "zh"))) {
    return
}

modelsDir := HasEnvVarOr("CYBERTRON_MODELS_DIR", "D:\\workspace\\lfs\\models")
modelName := HasEnvVarOr("CYBERTRON_MODEL", textgeneration.DefaultModelForMachineTranslation("en", "zh"))

start := time.Now()
m, err := tasks.Load[textgeneration.Interface](&tasks.Config{ModelsDir: modelsDir, ModelName: modelName})
if err != nil {
    log.Fatal().Err(err).Send()
}

log.Debug().Msgf("Loaded model %q in %v", modelName, time.Since(start))

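// logMetrics returns an error; report it instead of silently dropping it.
if err := logMetrics(); err != nil {
    log.Warn().Err(err).Msg("failed to collect metrics")
}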

opts := textgeneration.DefaultOptions()

fn := func(text string) error {
    start := time.Now()
    result, err := m.Generate(context.Background(), text, opts)
    if err != nil {
        return err
    }
    fmt.Println(time.Since(start).Seconds())
    fmt.Println(result.Texts[0])
    runtime.GC()
    return nil
}

err = ForEachInput(os.Stdin, fn)
if err != nil {
    log.Fatal().Err(err).Send()
}

}

func logMetrics() error {
zerolog.TimeFieldFormat = zerolog.TimeFormatUnix
log.Logger = log.Output(zerolog.ConsoleWriter{Out: os.Stderr})

// Get total CPU count
totalCpu, err := cpu.Counts(false)
if err != nil {
    return err
}
// Get process CPU percentage
p, err := process.NewProcess(int32(os.Getpid()))
if err != nil {
    return err
}
percent, err := p.CPUPercent()
if err != nil {
    return err
}

// Log CPU Metrics
log.Info().
    Int("total_cpu_cores", totalCpu).
    Float64("process_cpu_usage_percent", percent).
    Msg("CPU Metrics")

// Get total available RAM
vmStat, err := mem.VirtualMemory()
if err != nil {
    return err
}
// Get process RAM usage
memInfo, err := p.MemoryInfo()
if err != nil {
    return err
}

// Log RAM Metrics
log.Info().
    Float64("total_ram_available_mb", byteToMb(vmStat.Total)).
    Float64("process_ram_usage_mb", byteToMb(memInfo.RSS)).
    Msg("RAM Metrics")

return nil

}

func byteToMb(b uint64) float64 { return float64(b) / 1024 / 1024 }

ddkwork commented 9 months ago

GOROOT=C:\Program Files\Go #gosetup
GOPATH=C:\Users\Admin\go #gosetup
"C:\Program Files\Go\bin\go.exe" build -o C:\Users\Admin\AppData\Local\JetBrains\GoLand2023.2\tmp\GoLand_go_build_github_com_ddkwork_GolandProjects_translatetextgeneration.exe github.com/ddkwork/GolandProjects/translate/textgeneration #gosetup
C:\Users\Admin\AppData\Local\JetBrains\GoLand2023.2\tmp\GoLand\go_build_github_com_ddkwork_GolandProjects_translate_textgeneration.exe
INFO trace [2023-11-01 21:56:24] --------- title --------- | ------------------ info ------------------ //runtime.doInit1 proc.go:6740
{"level":"debug","file":"D:\workspace\lfs\models\Helsinki-NLP\opus-mt-en-zh\config.json","time":"2023-11-01T21:56:24+08:00","message":"model file already exists, skipping download"}
{"level":"debug","file":"D:\workspace\lfs\models\Helsinki-NLP\opus-mt-en-zh\pytorch_model.bin","time":"2023-11-01T21:56:24+08:00","message":"model file already exists, skipping download"}
{"level":"debug","file":"D:\workspace\lfs\models\Helsinki-NLP\opus-mt-en-zh\vocab.json","time":"2023-11-01T21:56:24+08:00","message":"model file already exists, skipping download"}
{"level":"debug","file":"D:\workspace\lfs\models\Helsinki-NLP\opus-mt-en-zh\source.spm","time":"2023-11-01T21:56:24+08:00","message":"model file already exists, skipping download"}
{"level":"debug","file":"D:\workspace\lfs\models\Helsinki-NLP\opus-mt-en-zh\target.spm","time":"2023-11-01T21:56:24+08:00","message":"model file already exists, skipping download"}
{"level":"info","model":"D:\workspace\lfs\models\Helsinki-NLP\opus-mt-en-zh\spago_model.bin","time":"2023-11-01T21:56:24+08:00","message":"model file already exists, skipping conversion"}
{"level":"fatal","error":"failed to load bart model: gob: type mismatch in decoder: want struct type nn.Param; got non-struct","time":"2023-11-01T21:56:24+08:00"}

Process finished with the exit code 1

nlpodyssey / cybertron

safe task in channel #16