swoole / phpy

Connecting the Python and PHP ecosystems together
Apache License 2.0

Dockerfile reference, models run successfully #7

Closed he426100 closed 1 month ago

he426100 commented 11 months ago

RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential git wget software-properties-common && \
    add-apt-repository ppa:ondrej/php && apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends php-cli php-dev && \
    rm -rf /var/lib/apt/lists/*

RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh && \
    chmod +x ~/miniconda.sh && \
    mkdir -p /opt/conda && \
    ~/miniconda.sh -b -u -p /opt/conda && \
    rm ~/miniconda.sh && \
    /opt/conda/bin/conda init bash

ENV PATH="/opt/conda/bin:$PATH"

RUN conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia && \
    conda install -c huggingface transformers

RUN git clone https://github.com/swoole/phpy.git /app/phpy

WORKDIR /app/phpy

RUN phpize && \
    ./configure --with-python-dir=/opt/conda && \
    make install && \
    echo "extension=phpy.so" > /etc/php/8.2/cli/conf.d/20_phpy.ini

RUN php -m | grep -i phpy

CMD [ "bash" ]
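
The `echo "extension=phpy.so"` step above hard-codes `/etc/php/8.2/cli/conf.d`, which breaks if the unversioned `php-cli` package resolves to a different PHP version. A minimal sketch that reads the directory from PHP itself, assuming the usual `php --ini` output format (a line beginning with `Scan for additional .ini files in:`). The function name `php_scan_dir` is made up for illustration:

```shell
# Hypothetical helper: read `php --ini` output on stdin and print the
# directory PHP scans for additional .ini files.
php_scan_dir() {
    sed -n 's/^Scan for additional \.ini files in: *//p'
}

# Usage inside the image (requires php on PATH):
#   echo "extension=phpy.so" > "$(php --ini | php_scan_dir)/20_phpy.ini"
```

Keeping the parser separate from the `php` call means it can be checked against sample output without PHP installed.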


- Usage

docker run --rm -it --gpus all --name phpy phpy bash

docker run -d -it --gpus all -v /data/app:/app -v /data/conda:/opt/conda -v /data/cache:/root/.cache --name phpy phpy

cd examples
php pipeline.php


- Install Docker and CUDA

sudo apt-get update
sudo apt-get install \
    ca-certificates \
    curl \
    gnupg \
    lsb-release -y
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io -y

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
    && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
    && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2

wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
sudo sh ./cuda_11.8.0_520.61.05_linux.run
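
The `distribution=$(. /etc/os-release;echo $ID$VERSION_ID)` line above sources the os-release file and concatenates two of its variables to pick the matching repository list. A small sketch that makes this visible by taking the file path as a parameter (the function name `distribution_of` is hypothetical) and running in a subshell so the sourced variables do not leak into the caller:

```shell
# Hypothetical helper: print ID followed by VERSION_ID from an
# os-release style file given as $1.
distribution_of() {
    # The subshell keeps ID and VERSION_ID out of the caller's environment.
    ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Usage: distribution_of /etc/os-release
```

On Ubuntu 22.04 this yields `ubuntu22.04`, which is the path segment substituted into the nvidia-docker list URL.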

he426100 commented 11 months ago

define('BASE_PATH', __DIR__ . '/..');
require BASE_PATH . '/vendor/autoload.php';
require __DIR__ . '/utils.php';

use Dotenv\Dotenv;
use Dotenv\Repository\Adapter;
use Dotenv\Repository\RepositoryBuilder;
use function Laravel\Prompts\text;
use function Laravel\Prompts\info;
use function Laravel\Prompts\error;

load_env();

$os = PyCore::import('os');
$platform = PyCore::import('platform');
$transformers = PyCore::import('transformers');
$AutoModel = $transformers->AutoModel;
$AutoTokenizer = $transformers->AutoTokenizer;
$torch = PyCore::import('torch');

$MODEL_PATH = getenv('MODEL_PATH') ?: 'THUDM/chatglm3-6b';
$TOKENIZER_PATH = getenv("TOKENIZER_PATH") ?: $MODEL_PATH;
$DEVICE = $torch->cuda->is_available() ? 'cuda' : 'cpu';

$tokenizer = $AutoTokenizer->from_pretrained($TOKENIZER_PATH, trust_remote_code: true);
if ($DEVICE == 'cuda') {
    // AMD, NVIDIA GPU can use Half Precision
    // $model = $AutoModel->from_pretrained($MODEL_PATH, trust_remote_code: true)->to($DEVICE)->eval();
    $model = load_model_on_gpus($MODEL_PATH, $torch->cuda->device_count());
} else {
    // CPU, Intel GPU and other GPU can use Float16 Precision Only
    $model = $AutoModel->from_pretrained($MODEL_PATH, trust_remote_code: true)->float()->to($DEVICE)->eval();
}

$welcome = 'Welcome to the ChatGLM3-6B model. Type a message to chat; "clear" resets the conversation history, "stop" exits the program';

$past_key_values = null;
$history = [];
$stop_stream = false;

info($welcome);

while (true) {
    $query = text('User:');
    if (trim($query) == 'stop') {
        break;
    } elseif (trim($query) == 'clear') {
        // Reset the conversation state and clear the terminal
        $past_key_values = null;
        $history = [];
        info("\033c");
        info($welcome);
        continue;
    }
    info('ChatGLM: ');
    try {
        $current_length = 0;
        $rs = $model->stream_chat($tokenizer, $query, history: $history, top_p: 1,
            temperature: 0.01,
            past_key_values: $past_key_values,
            return_past_key_values: true
        );
        $it = PyCore::iter($rs);
        echo " \e[32m";
        while ($next = PyCore::next($it)) {
            if ($stop_stream) {
                $stop_stream = false;
                break;
            } else {
                list($response, $history, $past_key_values) = PyCore::scalar($next);
                // Each step returns the full response so far; print only the new part
                echo mb_substr($response, $current_length);
                $current_length = mb_strlen($response);
            }
        }
        echo "\e[39m\n";
    } catch (\Throwable $e) {
        error($e->getMessage() ?: 'Execution failed');
    }
}

function load_env()
{
    $repository = RepositoryBuilder::createWithNoAdapters()
        ->addAdapter(Adapter\PutenvAdapter::class)
        ->immutable()
        ->make();

    Dotenv::create($repository, [BASE_PATH])->safeLoad();
}

he426100 commented 11 months ago

Tutorial for running examples/paddlenlp/test.php

he426100 commented 11 months ago

PaddlePaddle 10-minute quick start: handwritten digit recognition

ini_set('memory_limit', '2G');

$paddle = PyCore::import('paddle');
$np = PyCore::import('numpy');
$Normalize = PyCore::import('paddle.vision.transforms')->Normalize;

$transform = $Normalize(mean: [127.5], std: [127.5], data_format: 'CHW');

// Download the dataset and initialize the DataSet
$train_dataset = $paddle->vision->datasets->MNIST(mode: 'train', transform: $transform);
$test_dataset = $paddle->vision->datasets->MNIST(mode: 'test', transform: $transform);

// Build the network and initialize it
$lenet = $paddle->vision->models->LeNet(num_classes: 10);
$model = $paddle->Model($lenet);

// Prepare the training configuration: optimizer, loss function, and metric
$model->prepare(
    $paddle->optimizer->Adam(parameters: $model->parameters()),
    $paddle->nn->CrossEntropyLoss(),
    $paddle->metric->Accuracy()
);

// Train the model
$model->fit($train_dataset, epochs: 5, batch_size: 64, verbose: 1);

// Evaluate the model
$model->evaluate($test_dataset, batch_size: 64, verbose: 1);

// Save the model
$model->save('./output/mnist');

// Load the model
$model->load('output/mnist');

// Take one image from the test set
list($img, $label) = $test_dataset->__getitem__(0);

// Reshape the image from 1*28*28 to 1*1*28*28, adding a batch dimension to match the model's input format
$img_batch = $np->expand_dims($img->astype('float32'), axis: 0);

// Run inference and print the result; predict_batch returns a list, so take the data out of it to get the prediction
$out = $model->predict_batch($img_batch)[0];
$pred_label = $out->argmax();
PyCore::print(PyCore::str('true label: {}, pred label: {}')->format($label->__getitem__(0), $pred_label));

// Visualize the image
$plt = PyCore::import('matplotlib.pyplot');
$plt->imshow($img->__getitem__(0));

// The container has no GUI, so save the image to a file instead
$plt->imsave('./output/img.png', $img->__getitem__(0));



- Pitfall
Converting the line `img, label = test_dataset[0]` took some effort. I tried `list() = PyCore::scalar()`, but it would not run. I don't know whether there is a better way.

- Result
![image](https://github.com/swoole/phpy/assets/9689137/7857cee7-e5c1-41aa-b98d-7f88336210c7)
he426100 commented 11 months ago

Is there a prebuilt image in a registry? The Dockerfile build fails with the errors below and I don't know how to fix them.

Err:1 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2204/x86_64 InRelease
  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
Err:3 http://security.ubuntu.com/ubuntu jammy-security InRelease
  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 871920D1991BC93C
Get:4 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [119 kB]
Err:2 http://archive.ubuntu.com/ubuntu jammy InRelease
  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 871920D1991BC93C
Get:5 http://archive.ubuntu.com/ubuntu jammy-backports InRelease [109 kB]
Err:4 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 871920D1991BC93C
Err:5 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 871920D1991BC93C
Reading package lists...
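
When reading a log like the one above, it can help to reduce it to the distinct key IDs that apt could not verify. A minimal sketch (the function name `missing_apt_keys` is hypothetical; it only parses text and makes no claim about the root cause of the failure):

```shell
# Hypothetical helper: read apt output on stdin and print each distinct
# key ID reported after NO_PUBKEY, one per line.
missing_apt_keys() {
    grep -o 'NO_PUBKEY [0-9A-F]*' | awk '{print $2}' | sort -u
}

# Usage: apt-get update 2>&1 | missing_apt_keys
```

For the log quoted here it would print two IDs, `871920D1991BC93C` and `A4B469963BF863CC`.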

You can install the PHP environment and the phpy extension on top of an official image such as modelscope (recommended), huggingface, or PaddlePaddle.

# dependencies required for running "phpize"
# (see persistent deps below)
# Plain string rather than a bash array, so that $PHPIZE_DEPS expands to every package
PHPIZE_DEPS="autoconf dpkg-dev file g++ gcc libc-dev make pkg-config re2c"

# persistent / runtime deps
set -eux; \
    apt-get update; \
    apt-get install -y --no-install-recommends \
        $PHPIZE_DEPS \
        ca-certificates \
        curl \
        wget \
        xz-utils;

PHP_INI_DIR=/usr/local/etc/php

mkdir -p "$PHP_INI_DIR/conf.d"

PHP_VERSION=8.3.0
PHP_URL="https://www.php.net/distributions/php-8.3.0.tar.gz"

set -eux; \
    savedAptMark="$(apt-mark showmanual)"; \
    apt-get update; \
    apt-get install -y --no-install-recommends gnupg; \
    mkdir -p /usr/src/php; \
    cd /usr/src; \
    wget -c -O php.tar.gz "$PHP_URL"; \
    tar -zxvf php.tar.gz --strip-components=1 -C php;

set -eux; \
    savedAptMark="$(apt-mark showmanual)"; \
    apt-get update; \
    apt-get install -y --no-install-recommends \
        libcurl4-openssl-dev \
        libonig-dev \
        libreadline-dev \
        libsodium-dev \
        libsqlite3-dev \
        libssl-dev \
        libxml2-dev \
        zlib1g-dev \
    ; \
    cd /usr/src/php; \
    gnuArch="$(dpkg-architecture --query DEB_BUILD_GNU_TYPE)"; \
    debMultiarch="$(dpkg-architecture --query DEB_BUILD_MULTIARCH)"; \
    if [ ! -d /usr/include/curl ]; then \
        ln -sT "/usr/include/$debMultiarch/curl" /usr/local/include/curl; \
    fi; \
    ./configure \
        --build="$gnuArch" \
        --with-config-file-path="$PHP_INI_DIR" \
        --with-config-file-scan-dir="$PHP_INI_DIR/conf.d" \
        --with-mhash \
        --with-pic \
        --enable-mbstring \
        --enable-mysqlnd \
        --with-sodium=shared \
        --with-pdo-sqlite=/usr \
        --with-sqlite3=/usr \
        --with-curl \
        --with-iconv \
        --with-openssl \
        --with-readline \
        --with-zlib \
        --enable-phpdbg \
        --enable-phpdbg-readline \
        --with-pear \
        $(test "$gnuArch" = 's390x-linux-gnu' && echo '--without-pcre-jit') \
        --with-libdir="lib/$debMultiarch" \
        --enable-embed \
        --enable-dom \
        --enable-xml \
        --enable-xmlreader \
        --enable-xmlwriter \
        --enable-soap \
    ; \
    make -j "$(nproc)"; \
    make install; \
    cp -v php.ini-* "$PHP_INI_DIR/"; \
    cd /; \
    php --version


- Install phpy (modelscope CPU image)

git clone https://github.com/swoole/phpy && cd phpy && \
    phpize && \
    ./configure --with-python-dir=/opt/conda && \
    make install && \
    echo "extension=phpy.so" > /usr/local/etc/php/conf.d/20_phpy.ini && \
    php --ri phpy && \
    curl -sfL https://getcomposer.org/installer | php -- --install-dir=/usr/bin --filename=composer && \
    chmod +x /usr/bin/composer && composer --version && \
    composer install && composer test


- Verify the environment

php -r "PyCore::print(PyCore::import('modelscope.pipelines')->pipeline('word-segmentation')('今天天气不错,适合出去游玩'));"


- Run model inference in 2 minutes

<?php
$pipeline = PyCore::import('modelscope.pipelines')->pipeline;
$word_segmentation = $pipeline('word-segmentation', model: 'damo/nlp_structbert_word-segmentation_chinese-base');

$input_str = '今天天气不错,适合出去游玩';
PyCore::print($word_segmentation($input_str));
// {'output': ['今天', '天气', '不错', ',', '适合', '出去', '游玩']}

hgc357341051 commented 11 months ago

Could you make a CPU-only version? I only need to run CAPTCHA recognition.

he426100 commented 11 months ago

Could you make a CPU-only version? I only need to run CAPTCHA recognition.

Example code

<?php
/**
 * @link https://modelscope.cn/models/damo/cv_convnextTiny_ocr-recognition-general_damo/summary
 */
$pipeline = PyCore::import('modelscope.pipelines')->pipeline;
$Tasks = PyCore::import('modelscope.utils.constant')->Tasks;
$os = PyCore::import('os');
// The model can be swapped for xiaolv/ocr_small
$pipe = $pipeline($Tasks->ocr_recognition, model: 'damo/cv_convnextTiny_ocr-recognition-general_damo');
$file = './captcha.png';
file_put_contents($file, file_get_contents('https://business.swoole.com/page/captcha_register'));
echo 'Recognition result: ' . $pipe($file)['text'][0], PHP_EOL;

Environment

pip config set global.index-url https://mirrors.aliyun.com/pypi/simple
pip config set install.trusted-host mirrors.aliyun.com
pip install -U pip
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install modelscope transformers SentencePiece opencv-python
matyhtf commented 11 months ago

@he426100 These examples could be added to phpy's examples directory.

he426100 commented 11 months ago

Could you make a CPU-only version? I only need to run CAPTCHA recognition.

Use the one below; it is both fast and accurate: https://github.com/swoole/phpy/pull/22

zeroxxmmbm commented 10 months ago

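I ran the docker-php install script inside the official PaddlePaddle image and it failed with the error below. The container runs Ubuntu 20. What causes this error? The dpkg command itself works fine.

++ dpkg-architecture --query DEB_BUILD_GNU_TYPE
php_install.sh: line 51: dpkg-architecture: command not found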

he426100 commented 10 months ago

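I ran the docker-php install script inside the official PaddlePaddle image and it failed with the error below. The container runs Ubuntu 20. What causes this error? The dpkg command itself works fine.

++ dpkg-architecture --query DEB_BUILD_GNU_TYPE
php_install.sh: line 51: dpkg-architecture: command not found

With paddle-2.6.0 you first need to run apt install dpkg-dev pkg-config -y

The phpy build command needs to be changed to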

phpize && \
    ./configure --with-python-config=/usr/bin/python3.10-config && \
    make install && \
    echo "extension=phpy.so" > /usr/local/etc/php/conf.d/20_phpy.ini && \
    php --ri phpy
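
The `dpkg-architecture: command not found` failure comes from a tool that the install script expects but the base image lacks. A minimal preflight sketch that lists missing commands before the build starts (the function name `missing_cmds` is hypothetical, and the command list in the usage line is only an example drawn from the script):

```shell
# Hypothetical helper: print every argument that is not an available
# command in the current environment, one per line.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Usage: missing_cmds dpkg-architecture pkg-config re2c make gcc
```

Empty output means every listed tool is available; any name printed still needs its package installed, for example `dpkg-dev` for `dpkg-architecture`.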
zeroxxmmbm commented 10 months ago

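I ran the docker-php install script inside the official PaddlePaddle image and it failed with the error below. The container runs Ubuntu 20. What causes this error? The dpkg command itself works fine.

++ dpkg-architecture --query DEB_BUILD_GNU_TYPE
php_install.sh: line 51: dpkg-architecture: command not found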

With paddle-2.6.0 you first need to run apt install dpkg-dev pkg-config -y

The phpy build command needs to be changed to

phpize && \
    ./configure --with-python-config=/usr/bin/python3.10-config && \
    make install && \
    echo "extension=phpy.so" > /usr/local/etc/php/conf.d/20_phpy.ini && \
    php --ri phpy

OK, I have installed dpkg-dev and pkg-config and am testing the build now. Thanks.