eve-seat / seat

SeAT 0x. [UNSUPPORTED]
https://github.com/eveseat/seat
MIT License

Asset list update taking too long #401

Closed. MilosRasic closed this issue 7 years ago.

MilosRasic commented 9 years ago

Basically, I frequently see jobs like this: [screenshot: assetlistupdate job]

As a result, I'm getting a lot of jobs timing out after 60 minutes in the queue, because I have only one worker running.

Is this normal? Since jackknife has no problem pulling everything from full API keys in a matter of seconds, is it possible that my IP is being throttled by the API servers? Is it possible to get some insight into what exactly a specific job is doing while it's working?

warlof commented 9 years ago

What about the Redis settings? How many keys need to be checked? Take a look inside the laravel.log file.

MilosRasic commented 9 years ago

I have recently updated Redis to 3.0+ because the SeAT code requires at least 2.4 (for the multi-argument RPUSH it uses), and mine was 2.0 (that should probably be added to the installation instructions). I didn't fiddle with the settings.
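
For context, a minimal illustration of why 2.4 is the floor, assuming it's the queue push that needs it; the key name and payloads here are hypothetical:

    // Multi-argument RPUSH pushes several values in one command.
    // Redis only accepts this form since 2.4; a 2.0 server rejects it
    // with a wrong-number-of-arguments error.
    Redis::connection()->rpush('queues:default', '{"job":"a"}', '{"job":"b"}');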

I have a medium-sized nullsec alliance's worth of API keys.

I noticed that after completing the job in the screenshot, my worker was idle for quite a while (10-15 minutes) before picking up the next one, even though the queue wasn't empty.

I also have a mismatch between the MySQL and Redis queues: at the time of writing, 4548 jobs in Redis and 585 in the DB. Is there a way to clean up the phantom jobs from Redis? The queue:flush command didn't help.

Nothing suspicious in the logs except a couple of 'String could not be parsed as XML' errors from Pheal, which happen from time to time and are extremely annoying because they don't log the offending string itself.

Speaking of logs, you should really use Log::useDailyFiles(storage_path().'/logs/laravel.log'); in start/global.php. If that makes it into the master branch, make sure the installation notes say the cronjob should run as the web server user, so that the web server can access log files created by commands run from the cronjob.
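
For reference, a minimal sketch of that change, assuming the stock Laravel 4 layout where app/start/global.php configures the log writer:

    // app/start/global.php: write one dated log file per day instead of
    // a single ever-growing laravel.log.
    Log::useDailyFiles(storage_path().'/logs/laravel.log');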

ghost commented 9 years ago

To flush Redis you could run redis-cli -n 0 flushdb (if you use the default db 0)
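
If you'd rather do it from within Laravel, something like this should be equivalent, assuming the default Redis connection points at db 0; note it drops every key in that db, so only use it if SeAT is the sole tenant:

    // Flushes all keys in the Redis database used by the queue.
    Redis::connection()->flushdb();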

MilosRasic commented 9 years ago

Thanks. I've emptied both queues and everything is at 0. Hopefully my SeAT doesn't go wacky again as soon as it gets new jobs :)

By the way, is it safe to delete old finished jobs? 250k+ doesn't seem like a healthy number of rows if the information isn't needed.

MilosRasic commented 9 years ago

A couple of hours after a complete reset: a 3-hour-old Asset List Update job, 400+ jobs timed out, and 600 jobs in the queue. I delete the failed jobs, but the Redis number doesn't go down; it's still at 1000+. After a couple of minutes the Assets job finishes and others start going through the worker, then the Working Jobs list is empty for a while. Refreshing gives me a slightly different Redis number on each refresh. Finally, the worker picks up another job and the Redis number stabilizes, but still at 1000+ with 500+ jobs in the DB queue.

Is this normal behaviour?

MilosRasic commented 9 years ago

I think I've found the culprit behind the slow asset list updates. Using an ORM is fine for small data sets, but when you have thousands of characters, each with however many assets some people manage to accumulate, and you run a separate INSERT query for every one of those assets, the overhead adds up, even with the database on the same server as the application.

I've rewritten the inserts inside the loops in app/eveapi/character/CharacterAssetList.php to use the Query Builder instead of Eloquent, so they run one batch INSERT instead of one INSERT per asset item, like this:

            // Populate the assets for this character as well as the contents.
            $assets = array();
            $contents = array();
            foreach ($asset_list->assets as $asset) {

                /*$asset_data = new \EveCharacterAssetList;

                $asset_data->characterID = $characterID;
                $asset_data->itemID = $asset->itemID;
                $asset_data->locationID = $asset->locationID;
                $asset_data->typeID = $asset->typeID;
                $asset_data->quantity = $asset->quantity;
                $asset_data->flag = $asset->flag;
                $asset_data->singleton = $asset->singleton;
                $asset_data->save();*/

                $assets []= array(
                    'characterID' => $characterID,
                    'itemID' => $asset->itemID,
                    'locationID' => $asset->locationID,
                    'typeID' => $asset->typeID,
                    'quantity' => $asset->quantity,
                    'flag' => $asset->flag,
                    'singleton' => $asset->singleton
                );

                // Process the contents if there are any
                if (isset($asset->contents)) {

                    foreach ($asset->contents as $content) {

                        /*$content_data = new \EveCharacterAssetListContents;

                        $content_data->characterID = $characterID;
                        $content_data->itemID = $asset_data->itemID;
                        $content_data->typeID = $content->typeID;
                        $content_data->quantity = $content->quantity;
                        $content_data->flag = $content->flag;
                        $content_data->singleton = $content->singleton;
                        $content_data->rawQuantity = (isset($content->rawQuantity) ? $content->rawQuantity : 0);

                        $asset_data->contents()->save($content_data);*/
                        $contents []= array(
                            'characterID' => $characterID,
                            'itemID' => $asset->itemID,
                            'typeID' => $content->typeID,
                            'quantity' => $content->quantity,
                            'flag' => $content->flag,
                            'singleton' => $content->singleton,
                            'rawQuantity' => (isset($content->rawQuantity) ? $content->rawQuantity : 0)
                        );
                    }
                }
            }

            DB::table('character_assetlist')->insert($assets);
            DB::table('character_assetlist_contents')->insert($contents);

You can also do this for corporation assets, including the asset locations.

Sadly, I can't make a pull request because my SeAT is heavily modified with enhancements that would be of no interest to the main project, and it would take too long to untangle the mess I've made by not developing them on a separate branch.

I'd still like to know whether the extra jobs in the Redis queue are normal. Thanks.

MilosRasic commented 9 years ago

Another update. It turns out that MySQL's PDO driver can't handle an INSERT of that size: there's a hard cap of roughly 65k (65,535) placeholders per prepared statement, which at 7 columns per row works out to about 9,300 rows, and, yes, obviously some people have more than 10k asset items.

The good thing is that I found my queue empty in the morning, which confirms that the long AssetList jobs were the cause of my performance issues.

I've replaced the Builder::insert() calls with DB::unprepared() calls. Unfortunately, there seems to be no way to get the Laravel Query Builder to compile raw INSERTs, so I had to build them manually.

eve-seat commented 9 years ago

Thanks for looking into this in such detail! I haven't checked everything out properly yet, but could you help me understand in more detail where exactly the bottleneck is coming from, based on your findings @MilosRasic?

MilosRasic commented 9 years ago

Problem: Some asset list updates take a very long time.

Cause: Since jackknife has no problem pulling and parsing asset lists for those players, we can assume the API calls are not the problem. The real problem is that Eloquent is used to save assets in a loop over the API call results, one asset item per iteration. Each save carries a few small overheads: many function calls, a prepared statement, and a database round trip per asset item.

Solution: Use a multi-row insert, i.e. a single large query that inserts all the data at once. My first idea was to use Laravel's Query Builder: DB::table()->insert() accepts a numeric array of associative arrays and performs a multi-row INSERT. This leads us to the second problem.

Problem 2: There's a hard cap on the number of placeholders in a prepared statement in MySQL and MariaDB, and the Query Builder uses prepared statements only. A query with more placeholders than the cap fails with an error.

Solution: Use unprepared statements with DB::unprepared(). The queries have to be built manually.

The code below is for corporation asset list updates, because those also fetch asset locations, which character asset list updates don't have:

        // Check if the data in the database is still considered up to date.
        // checkDbCache will return true if this is the case
        if (!BaseApi::checkDbCache($scope, $api, $asset_list->cached_until, $corporationID)) {

            // TODO: Look at how we update this. As per https://neweden-dev.com/Character/Asset_List, the itemID
            // may change, so we can't really just update existing ids. For now, we just trash all and recreate the
            // assets :<
            // Maybe do this in one big transaction? lel

            \EveCorporationAssetList::where('corporationID', '=', $corporationID)->delete();
            \EveCorporationAssetListContents::where('corporationID', '=', $corporationID)->delete();

            // Note about /corp/Locations
            //
            // We will build a list of itemIDs to update the location information for
            // while we loop over the assets. These will be stored in $location_retrieve and
            // processed below [1]
            $location_retrieve = array();

            // Populate the assets for this corporation as well as the contents.
            $assets = array();
            $contents = array();
            foreach ($asset_list->assets as $asset) {

                // Add the asset to the location retrieval array
                $location_retrieve[] = $asset->itemID;

                // Process the rest of the database population
                $assets []= '('.implode(', ', array(
                    'corporationID' => $corporationID,
                    'itemID' => $asset->itemID,
                    'locationID' => $asset->locationID,
                    'typeID' => $asset->typeID,
                    'quantity' => $asset->quantity,
                    'flag' => $asset->flag,
                    'singleton' => $asset->singleton
                )).')';

                // Process the contents if there are any
                if (isset($asset->contents)) {

                    foreach ($asset->contents as $content) {
                        $contents []= '('.implode(', ', array(
                            'corporationID' => $corporationID,
                            'itemID' => $asset->itemID,
                            'typeID' => $content->typeID,
                            'quantity' => $content->quantity,
                            'flag' => $content->flag,
                            'singleton' => $content->singleton,
                            'rawQuantity' => (isset($content->rawQuantity) ? $content->rawQuantity : 0)
                        )).')';
                    }
                }
            }

            if (count($assets))
                DB::unprepared('INSERT INTO corporation_assetlist (corporationID, itemID, locationID, typeID, quantity, flag, singleton) VALUES '.implode(', ', $assets));
            if (count($contents))
                DB::unprepared('INSERT INTO corporation_assetlist_contents (corporationID, itemID, typeID, quantity, flag, singleton, rawQuantity) VALUES '.implode(', ', $contents));

            // Now empty and process the locations as per [1]
            \EveCorporationAssetListLocations::where('corporationID', '=', $corporationID)->delete();
            $location_retrieve = array_chunk($location_retrieve, 1);

            // Iterate over the chunks.
            foreach ($location_retrieve as $chunk) {

                try {

                    $locations = $pheal
                        ->corpScope
                        ->Locations(array('characterID' => $characters[0], 'ids' => implode(',', $chunk)));

                } catch (\Pheal\Exceptions\PhealException $e) {

                    // Temp hack to check the asset list thingie
                    // TBH, I am not 100% sure yet why the freaking call would fail for an id we **just**
                    // got from the previous API call...
                    if ($e->getCode() == 135 || $e->getCode() == 221)   // 135 "not the owner" | 221 "illegal page request"
                        continue;
                    else
                        throw $e;
                }

                // Loop over the locations, check their closest celestial
                // and add the data to the database
                $assetLocations = array();
                foreach ($locations->locations as $location) {

                    $closest_moon = BaseApi::findClosestMoon($location->itemID, $location->x, $location->y, $location->z);

                    $assetLocations []= '('.implode(', ', array(
                        'corporationID' => $corporationID,
                        'itemID' => $location->itemID,
                        'itemName' => DB::connection()->getPdo()->quote($location->itemName),
                        'x' => $location->x,
                        'y' => $location->y,
                        'z' => $location->z,
                        'mapID' => $closest_moon['id'] ? $closest_moon['id'] : 'NULL',
                        'mapName' => $closest_moon['name'] ? DB::connection()->getPdo()->quote($closest_moon['name']) : 'NULL'
                    )).')';
                }

                //DB::table('corporation_assetlist_locations')->insert($assetLocations);
                if (count($assetLocations))
                    DB::unprepared('INSERT INTO corporation_assetlist_locations (corporationID, itemID, itemName, x, y, z, mapID, mapName) VALUES '.implode(', ', $assetLocations));
            }

            // Update the cached_until time in the database for this api call
            BaseApi::setDbCache($scope, $api, $asset_list->cached_until, $corporationID);
        }
ghost commented 9 years ago

Has this been added to the project? I'm having the same issues as you.

eve-seat commented 9 years ago

@ChrisKing517 not yet. But soon™ :)

eve-seat commented 9 years ago

I am finally looking at this for the SeAT 1.x EVE API code here. I am not a fan of the unprepared statements, and they also break my attempt to keep the code database-agnostic (as far as possible for now). I think it will be best to chunk the asset list so that we don't hit the placeholder limit. I will link the new code here once done.
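
For illustration, a minimal sketch of the chunked approach, given the $assets array built in the loop above and using the placeholder math from earlier in the thread (65,535 placeholders, 7 columns per asset row); the chunk size and table name are just examples:

    // 5000 rows x 7 columns = 35,000 placeholders per statement,
    // comfortably below MySQL's 65,535 cap, while still batching the inserts.
    foreach (array_chunk($assets, 5000) as $chunk)
        DB::table('character_assetlist')->insert($chunk);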

eve-seat commented 9 years ago

Okaaaay. Just pushed the Character AssetsList Updater to the 1.x codebase. @MilosRasic, could you have a look at this file and let me know what you think?