EvilFreelancer / routeros-api-php

Mikrotik RouterOS API PHP client for your applications
https://mikrotik.com/software
MIT License
404 stars 150 forks source link

reading response as a stream? #13

Closed arily closed 5 years ago

arily commented 5 years ago

it's a little extreme but when I trying to print my address list which has about 80k ip/cidr, PHP runs out of memory. Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 20480 bytes) in /~~~/vendor/evilfreelancer/routeros-api-php/src/Client.php on line 252.

EvilFreelancer commented 5 years ago

Hello @arily ! Interesting problem, this probably happens because the entire RouterOS response is read and placed into memory before it is processed and moving to following methods of your script.

I like your idea about stream conseption of response from router, I will be glad if you can implement this solution in code, it will significantly speed up the release of the new version.

Also maybe for current sollution better to use yield generators, I will try it today after my work.

But for right now you probably may use RAW results from your router without parsing/processing, for this try ->read() method with false value, eg ->read(false). You will receive non-formatted results, but it's better than nothing :)

UPD: By the way, you may add line like this ini_set('memory_limit','512M'); to top of your script, it should increase memory limits.

EvilFreelancer commented 5 years ago

Greeting @arily , any suggested solution is helped to you or you still has problems with memory?

arily commented 5 years ago

Hi, I checked my memory usage and it seems parsing raw respond caused memory exhausted. I am currently working on an Array Like object which will parsing response only when access it, this will trade memory using cpu time which will always considering cheaper compare to memory in most situations. I have almost no idea how ros' response parsed so I am only aiming on /ip/firewall/address-list/print. I will let you know if I made it work and see if we can make more use of it. yield generators should also works fine, but I have not tried it yet.

https://github.com/arily/RouterOSResponseArray I post a sketchy version here.

EvilFreelancer commented 5 years ago

Hello! Sorry for very long response, I was on vocation and just forgot about your issue.

I've tried to use \Iterators etc but seems it's not-required, because issue was in parsing of large array response from RouterOS, I've created rosario method based on your solution and replaced parseResponse to it.

I hope your problem is now solved :) Update with be with tag 1.0 (first stable release, yeah)

PS All tests pass without errors

EvilFreelancer commented 5 years ago

I hope you can check the develop version from the master branch and confirm that the problem is solved before I've release 1.0

arily commented 5 years ago

Congratulations for upcoming stable release! I will check it as soon as I could!

arily commented 5 years ago

Hi,

I tested it on my router contains 27564 Address-list record (yes more extreme than before), and parse only 100 record, and this test seems failed. and now the script runs too long (nginx's default is 60s)

2019/07/21 18:06:55 [error] 10294#10294: *6047143 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 162.158.118.184, server: accounting.ri.mk, request: "POST /AddressList/query/ HTTP/1.1", upstream: "fastcgi://unix:/run/php/php7.3-fpm.sock", host: "accounting.ri.mk", referrer: "https://accounting.ri.mk/AddressList/query"

It looks like that parsing cost too much of time.

and old ROSRA works. Probably because it's only parsing response when required. so I think it's still significative to implement \Iterators.

here is the fake code

function query($query){
        set_time_limit(120);
        $respond = [];
        try{
                        $client = new Client([
                'host' => 'h.o.s.t',
                'user' => 'admin',
                'pass' => 'pass'
            ]);
            $respond = $client->wr(['/ip/firewall/address-list/print','?'.$query]);
        } catch (\Exception $e){
            var_dump($e);
        }
        return $respond;
    }
function query_ROSRA($query){
        set_time_limit(120);
        $respond = [];
        try{
                        $client = new Client([
                'host' => 'h.o.s.t',
                'user' => 'admin',
                'pass' => 'pass'
            ]);
            $respond = new ROSRA($client->wr(['/ip/firewall/address-list/print','?'.$query]),false);
        } catch (\Exception $e){
            var_dump($e);
        }
        return $respond;
    }

//test

$data['limit'] = (isset($data['limit'])) ? $data['limit'] : 100;
$data['trim'] = (isset($data['trim'])) ? $data['trim'] : 0;
if ($return = query('list=long');){ //OR $return = query_ROSRA('list=long'); 
    $trim = $data['trim'];
    $length = $data['limit'];
    $limit = $trim + $length;
    $arr = [];
    for ($i = $trim ;$i < $limit; $i++){
        if (isset($return[$i])){
            $arr[] = $return[$i];
        }
    }
        echo json_encode ['status' => TRUE,'data'=>$arr,'input'=>$data];
}
EvilFreelancer commented 5 years ago

Hi!

Okay, the speed issue is better than the memory usage issue (usually have to choose one from these), I think you can slightly increase nginx timeout.

PS. Why you use web-server for this? You can write CLI tool which wouldn't have issues with web-server limits.

php -f ./my-super-script.php

PS/2 Loop in your code is not okay, better to use array_slice:

$data['limit'] = $data['limit'] ?? 100;
$data['trim']  = $data['trim'] ?? 0;

if ($return = query('list=long')) {
    $arr = array_slice($return, $data['trim'], $data['limit']);
    echo json_encode(['status' => true, 'data' => $arr, 'input' => $data]);
}

Also, you may use caching, as you know, your code asked the router every time when you asked web-server with different limits, I recommend to use memcached or ordinary serialize/unserialize (it's also better than nothing).

EvilFreelancer commented 5 years ago

Hm, probably your speed issue was on loop stage, probably need to check the result from router first, then check the speed of loop:

$data['limit'] = $data['limit'] ?? 100;
$data['trim']  = $data['trim'] ?? 0;

$return = query('list=long');
var_dump($return);

if ($return) {
    $arr = array_slice($return, $data['trim'], $data['limit']);
    echo json_encode(['status' => true, 'data' => $arr, 'input' => $data]);
}
arily commented 5 years ago

hi, it's just a test code and I agree as you said I should using cli. I don't have access to ssh only if I using my IP address at home. But I can upload my script to my server. ROSRA are using \Iterators so any of the array_* function is not usable. I have no choice but looping in my code when ROSRA are used.

I set a timer and it shows this loop actually not consuming that much of time. (using microtime(true) so the unit will be second. time: fetch_data: //fetch data. (in old ROSRA this will be raw data and in new code this will be parsed data) Looping: //looping from 0 to 100 (used array_slice in new code and ROSRA's parsing time cost are covered here) time:

result of ROSRA

time: fetch_data: 5.3953399658203 Looping: 0.00049495697021484 time: fetch_data: 5.4316501617432 Looping: 0.00051712989807129 time: fetch_data: 5.4239640235901 Looping: 0.00058293342590332

and the result new code

time: fetch_data: 91.435431957245 Looping: 2.4080276489258E-5 time: fetch_data: 94.435980081558 Looping: 2.7894973754883E-5 time: fetch_data: 91.331300020218 Looping: 2.0980834960938E-5

I tested it 3 times.

I do agree with you that this are more likely won't in any production environment, but if there's a way to improve overall performance why not? :P at least add readAsIterators function? normal read will cover 99% of situations and this could potentially cover the last 1%.

EvilFreelancer commented 5 years ago

I like idea with readAsIterators, it will not be able to damage the current implementation of logic, for example login() method depends on raw response from RouterOS API.

Can you create Pull Request with your solution of this? If you don't know how then I can write it.

arily commented 5 years ago

I created it. check it out

EvilFreelancer commented 5 years ago

Hello! Thanks for you PR, I've slightly updated codebas in #14 and added some documentation, plus I've updated readAsIterator, now it will receive Client object instread ->read(false) results. Btw, please read about PSR, I guess you should enable PSR for autoformating in your IDE.

EvilFreelancer commented 5 years ago

If you have time, please check my updates, and if everything is okay, I will merge your changes.

arily commented 5 years ago

Hello! Thanks for you PR, I've slightly updated codebas in #14 and added some documentation, plus I've updated readAsIterator, now it will receive Client object instread ->read(false) results. Btw, please read about PSR, I guess you should enable PSR for autoformating in your IDE.

thanks for the advice! I am set it to PSR!

arily commented 5 years ago

If you have time, please check my updates, and if everything is okay, I will merge your changes.

It looks fine to me.

EvilFreelancer commented 5 years ago

Okay, then well done :)

EvilFreelancer commented 5 years ago

Release will be soon, just need add details about Iterator method to README file