RubixML / Tensor

A library and extension that provides objects for scientific computing in PHP.
https://rubixml.com
MIT License
223 stars 27 forks source link

PHP 8.2 compatibility #33

Closed myfluxi closed 9 months ago

myfluxi commented 1 year ago

In case you need to move to PHP 8.2 before @andrewdalpino find's time to update Tensor:

https://github.com/myfluxi/Tensor/commit/a6c229b762af5227a85db295e1d7451ab7edc8e9

sneakyimp commented 1 year ago

EDIT: I managed to compile your Update code. Details below

On my Ubuntu 20.04 workstation (running PHP 8.2), I have carefully followed the instructions for Manually Compiling the Extension except instead of cloning the RubixML repo, I clone your myfuxi one instead:

# not this repo
# git clone https://github.com/RubixML/Tensor
# this one instead!
git clone https://github.com/myfluxi/Tensor
# and check out Update for PHP 8.2
git checkout a6c229b762af5227a85db295e1d7451ab7edc8e9
# continue with other steps...

I had to fiddle with my PHP ini files in CLI and FPM folders, etc., but I now see the Tensor extension listed. I then tried to run this script, which loads a 1375x1966 matrix out of a JSON file and tries to multiply it by its inverse. It ends with a segfault:

<?php
/**
 * script to try and use the PECL tensor extension to train a spam filter
 */

$json_file = realpath(__DIR__ . '/../training_data_sets.json');
$data = json_decode(file_get_contents($json_file), TRUE);
echo "rows: " . sizeof($data['Xtrain']) . "\n";
echo "cols: " . sizeof($data['Xtrain'][0]) . "\n";

use Tensor\Matrix;

$m = new Matrix($data['Xtrain']);

$mt = new Matrix($data['Xtrain']);
$mt->transpose();

$start = microtime(TRUE);
$mr = $m->multiply($mt);
echo (microtime(TRUE) - $start), " seconds elapsed for multiply\n";

//$arr = $mr->asArray();
//echo "rows: " . sizeof($arr) . "\n";
//echo "cols: " . sizeof($arr[0]) . "\n";

The result:

$ php bar2.php
rows: 1375
cols: 1966
Segmentation fault (core dumped)

Bizarrely, if I remove the comments, this script claims to finish very quickly:

<?php
$json_file = realpath(__DIR__ . '/../training_data_sets.json');
$data = json_decode(file_get_contents($json_file), TRUE);
echo "rows: " . sizeof($data['Xtrain']) . "\n";
echo "cols: " . sizeof($data['Xtrain'][0]) . "\n";

use Tensor\Matrix;

$m = new Matrix($data['Xtrain']);

$mt = new Matrix($data['Xtrain']);
$mt->transpose();

$start = microtime(TRUE);
$mr = $m->multiply($mt);
echo (microtime(TRUE) - $start), " seconds elapsed for multiply\n";
unset($m);
unset($mt);

$v = $mr->asArray();
echo "rows: " . sizeof($v) . "\n";
echo "cols: " . sizeof($v[0]) . "\n";

The result of the latter code:

rows: 1375
cols: 1966
0.054421901702881 seconds elapsed for multiply
rows: 1375
cols: 1966

The latter dimensions are incorrect. Should be 1375 x 1375, I think.

Something seems very unstable here.

andrewdalpino commented 1 year ago

Do you want to give it a try with the latest version (3.0.3) of the extension? This would be the first version that supports PHP 8.2.

remicollet commented 1 year ago

PHP 8.1 and 8.2 support is still not complete, iterator methods don't have the proper prototype.

$ php  --re tensor
Deprecated: Return type of Tensor\Vector::offsetGet($index) should either be compatible with ArrayAccess::offsetGet(mixed $offset): mixed, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in Unknown on line 0
Deprecated: Return type of Tensor\Vector::getIterator() should either be compatible with IteratorAggregate::getIterator(): Traversable, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in Unknown on line 0
Deprecated: Return type of Tensor\Matrix::getIterator() should either be compatible with IteratorAggregate::getIterator(): Traversable, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in Unknown on line 0
andrewdalpino commented 1 year ago

PHP 8.1 and 8.2 support is still not complete, iterator methods don't have the proper prototype.

$ php  --re tensor
Deprecated: Return type of Tensor\Vector::offsetGet($index) should either be compatible with ArrayAccess::offsetGet(mixed $offset): mixed, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in Unknown on line 0
Deprecated: Return type of Tensor\Vector::getIterator() should either be compatible with IteratorAggregate::getIterator(): Traversable, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in Unknown on line 0
Deprecated: Return type of Tensor\Matrix::getIterator() should either be compatible with IteratorAggregate::getIterator(): Traversable, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in Unknown on line 0

I noticed this too @remicollet. I'm not sure how we'd go about adding these attributes to the extension code, however.

remicollet commented 1 year ago

I don't know anythng about zephir things (excepted this add much complexity)

BTW, phalcon seems to have some iterators:

See

mlocati commented 9 months ago

About the warning about the return types: see #35

I think this #36 is about the segmentation fault we have with this sample code.

By running

strace php -r 'var_dump(Tensor\Matrix::ones(2,2)->inverse());'

we have that the last lines of the output are:

write(1, "\nDeprecated: Use of \"self\" in ca"..., 85
Deprecated: Use of "self" in callables is deprecated in Command line code on line 1
) = 85
write(1, "\nDeprecated: Use of \"self\" in ca"..., 85
Deprecated: Use of "self" in callables is deprecated in Command line code on line 1
) = 85
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x10} ---
+++ killed by SIGSEGV +++

So, the segfault may be caused by using self in callables...