State-of-the-art Machine Learning for PHP
TransformersPHP is designed to be functionally equivalent to the Python library while maintaining the same level of performance and ease of use. It is built on top of Hugging Face's Transformers library, which provides thousands of pre-trained models in 100+ languages, and it exposes a similar API so PHP developers will find it simple and familiar. These models can be used for a variety of tasks, including text generation, summarization, translation, and more.
TransformersPHP uses ONNX Runtime, a high-performance scoring engine for Open Neural Network Exchange (ONNX) models, to run the models. You can easily convert any PyTorch or TensorFlow model to ONNX and use it with TransformersPHP using 🤗 Optimum.
To learn more about the library and how it works, head over to our extensive documentation.
Because TransformersPHP is designed to be functionally equivalent to the Python library, it's super easy to learn from existing Python or JavaScript code. We provide the `pipeline` API, a high-level, easy-to-use API that groups together a model with its necessary preprocessing and postprocessing steps.
Python (original):

```python
from transformers import pipeline

# Allocate a pipeline for sentiment-analysis
pipe = pipeline('sentiment-analysis')
out = pipe('I love transformers!')
# [{'label': 'POSITIVE', 'score': 0.999806941}]
```

PHP (ours):

```php
use function Codewithkyrian\Transformers\Pipelines\pipeline;

// Allocate a pipeline for sentiment-analysis
$pipe = pipeline('sentiment-analysis');
$out = $pipe('I love transformers!');
// [{'label': 'POSITIVE', 'score': 0.999808732}]
```

JavaScript (Xenova):

```javascript
import { pipeline } from '@xenova/transformers';

// Allocate a pipeline for sentiment-analysis
let pipe = await pipeline('sentiment-analysis');
let out = await pipe('I love transformers!');
// [{'label': 'POSITIVE', 'score': 0.999817686}]
```
You can also use a different model by specifying the model id or path as the second argument to the `pipeline` function. For example:
```php
use function Codewithkyrian\Transformers\Pipelines\pipeline;

// Allocate a pipeline for sentiment analysis with a specific model
$pipe = pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english');
```
You can install the library via Composer, which is the recommended installation method:

```shell
composer require codewithkyrian/transformers
```
> [!CAUTION]
> The ONNX library is platform-specific, so it's important to run the `composer require` command on the target platform where the code will be executed. In most cases, this will be your development machine or a server where you deploy your application, but if you're using a Docker container, run the `composer require` command inside that container.
TransformersPHP uses the PHP FFI extension to interact with the ONNX Runtime. The FFI extension is included by default in PHP 7.4 and later, but it may not be enabled by default. If the FFI extension is not enabled, you can enable it by uncommenting (removing the `;` from the beginning of) the following line in your `php.ini` file:

```ini
extension = ffi
```

You also need to set the `ffi.enable` directive to `true` in your `php.ini` file:

```ini
ffi.enable = true
```
After making these changes, restart your web server or PHP-FPM service, and you should be good to go.
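If you want to confirm the changes took effect, a quick check using only built-in PHP functions can help (a minimal sketch; note that `ffi.enable` also accepts the value `preload`, which restricts FFI to preloaded scripts and is not sufficient here):

```php
<?php

// Report whether the FFI extension is loaded and fully enabled.
$loaded  = extension_loaded('ffi');
$enable  = strtolower((string) ini_get('ffi.enable'));
$enabled = $loaded && in_array($enable, ['1', 'true', 'on'], true);

echo $loaded  ? "FFI extension: loaded\n"  : "FFI extension: missing\n";
echo $enabled ? "ffi.enable: true\n"       : "ffi.enable: not set to true (got '$enable')\n";
```

Run it with `php check-ffi.php` using the same PHP binary (CLI vs. FPM) that will run your application, since each can use a different `php.ini`.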
For more detailed information on how to use the library, check out the documentation: https://codewithkyrian.github.io/transformers-php
By default, TransformersPHP uses hosted pretrained ONNX models. For supported tasks, models that have been converted to work with Xenova's Transformers.js on the Hugging Face Hub should work out of the box with TransformersPHP.
You can configure the behaviour of the TransformersPHP library as follows:
```php
use Codewithkyrian\Transformers\Transformers;

Transformers::setup()
    ->setCacheDir('...')            // Set the default cache directory for transformers models. Defaults to `.transformers-cache/models`
    ->setRemoteHost('...')          // Set the remote host for downloading models. Defaults to `https://huggingface.co`
    ->setRemotePathTemplate('...')  // Set the remote path template for downloading models. Defaults to `{model}/resolve/{revision}/{file}`
    ->setAuthToken('...')           // Set the auth token for downloading models. Defaults to `null`
    ->setUserAgent('...')           // Set the user agent for downloading models. Defaults to `transformers-php/{version}`
    ->setImageDriver('...')         // Set the image driver for processing images. Defaults to `IMAGICK`
    ->apply();                      // Apply the configuration
```
You can call the `set*` methods in any order, or leave any out entirely, in which case the default values are used. For more information on the configuration options and what they mean, check out the documentation.
TransformersPHP only works with ONNX models, so you must convert your PyTorch, TensorFlow, or JAX models to ONNX. It is recommended to use 🤗 Optimum to perform the conversion and quantization of your model.
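For illustration, a conversion with 🤗 Optimum's CLI might look like the following (a sketch run in a Python environment; the model id and output directory are examples):

```shell
# Install Optimum with ONNX export support
pip install "optimum[exporters]"

# Export a Hugging Face Hub model to ONNX; the output directory will
# contain the .onnx graph plus the tokenizer and config files
optimum-cli export onnx --model distilbert-base-uncased-finetuned-sst-2-english onnx_output/
```

See the Optimum documentation for quantization options and task-specific export flags.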
By default, TransformersPHP automatically retrieves model weights (ONNX format) from the Hugging Face model hub when you first use a pipeline or pretrained model. This can lead to a slight delay during the initial use. To improve the user experience, it's recommended to pre-download the models you intend to use before running them in your PHP application, especially for larger models. One way to do that is to run the request once manually, but TransformersPHP also comes with a command-line tool to help you do just that:
```shell
./vendor/bin/transformers download <model_identifier> [<task>] [options]
```
Explanation of arguments:

- `<model_identifier>`: the id of the model on the Hugging Face Hub (e.g. `Xenova/distilbert-base-uncased-finetuned-sst-2-english`).
- `[<task>]`: an optional task name, so only the files needed for that task are downloaded.
- `[options]`: optional flags controlling the download; see the documentation for the full list.
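For example, to pre-download the sentiment-analysis model used earlier (the model id and task are illustrative):

```shell
./vendor/bin/transformers download Xenova/distilbert-base-uncased-finetuned-sst-2-english text-classification
```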
> [!CAUTION]
> Remember to add your cache directory to your `.gitignore` file to avoid committing the downloaded models to your git repository.
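With the default cache location, for instance, the ignore entry would be (assuming you have not changed it via `setCacheDir`):

```
.transformers-cache/
```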
This package is a WIP, but here's a list of tasks and architectures currently tested and supported by TransformersPHP.
**Natural Language Processing**

| Task | ID | Description | Supported? |
|------|----|-------------|------------|
| Fill-Mask | `fill-mask` | Masking some of the words in a sentence and predicting which words should replace those masks. | ✅ |
| Question Answering | `question-answering` | Retrieving the answer to a question from a given text. | ✅ |
| Sentence Similarity | `sentence-similarity` | Determining how similar two texts are. | ✅ |
| Summarization | `summarization` | Producing a shorter version of a document while preserving its important information. | ✅ |
| Table Question Answering | `table-question-answering` | Answering a question about information from a given table. | ❌ |
| Text Classification | `text-classification` or `sentiment-analysis` | Assigning a label or class to a given text. | ✅ |
| Text Generation | `text-generation` | Producing new text by predicting the next word in a sequence. | ✅ |
| Text-to-text Generation | `text2text-generation` | Converting one text sequence into another text sequence. | ✅ |
| Token Classification | `token-classification` or `ner` | Assigning a label to each token in a text. | ✅ |
| Translation | `translation` | Converting text from one language to another. | ✅ |
| Zero-Shot Classification | `zero-shot-classification` | Classifying text into classes that are unseen during training. | ✅ |
**Vision**

| Task | ID | Description | Supported? |
|------|----|-------------|------------|
| Depth Estimation | `depth-estimation` | Predicting the depth of objects present in an image. | ❌ |
| Image Classification | `image-classification` | Assigning a label or class to an entire image. | ✅ |
| Image Segmentation | `image-segmentation` | Dividing an image into segments where each pixel is mapped to an object. This task has multiple variants such as instance segmentation, panoptic segmentation and semantic segmentation. | ❌ |
| Image-to-Image | `image-to-image` | Transforming a source image to match the characteristics of a target image or a target image domain. | ✅ |
| Mask Generation | `mask-generation` | Generating masks for the objects in an image. | ❌ |
| Object Detection | `object-detection` | Identifying objects of certain defined classes within an image. | ✅ |
**Audio**

| Task | ID | Description | Supported? |
|------|----|-------------|------------|
| Audio Classification | `audio-classification` | Assigning a label or class to a given audio clip. | ❌ |
| Audio-to-Audio | N/A | Generating audio from an input audio source. | ❌ |
| Automatic Speech Recognition | `automatic-speech-recognition` | Transcribing a given audio clip into text. | ❌ |
| Text-to-Speech | `text-to-speech` or `text-to-audio` | Generating natural-sounding speech given text input. | ❌ |
**Tabular**

| Task | ID | Description | Supported? |
|------|----|-------------|------------|
| Tabular Classification | N/A | Classifying a target category (a group) based on a set of attributes. | ❌ |
| Tabular Regression | N/A | Predicting a numerical value given a set of attributes. | ❌ |
**Multimodal**

| Task | ID | Description | Supported? |
|------|----|-------------|------------|
| Document Question Answering | `document-question-answering` | Answering questions on document images. | ❌ |
| Feature Extraction | `feature-extraction` | Transforming raw data into numerical features that can be processed while preserving the information in the original dataset. | ✅ |
| Image Feature Extraction | `image-feature-extraction` | Extracting features from images. | ✅ |
| Image-to-Text | `image-to-text` | Generating text from a given image. | ✅ |
| Text-to-Image | `text-to-image` | Generating images from input text. | ❌ |
| Visual Question Answering | `visual-question-answering` | Answering open-ended questions based on an image. | ❌ |
| Zero-Shot Audio Classification | `zero-shot-audio-classification` | Classifying audio clips into classes that are unseen during training. | ❌ |
| Zero-Shot Image Classification | `zero-shot-image-classification` | Classifying images into classes that are unseen during training. | ✅ |
| Zero-Shot Object Detection | `zero-shot-object-detection` | Identifying objects of classes that are unseen during training. | ✅ |
**Reinforcement Learning**

| Task | ID | Description | Supported? |
|------|----|-------------|------------|
| Reinforcement Learning | N/A | Learning from actions by interacting with an environment through trial and error and receiving rewards (negative or positive) as feedback. | ❌ |