backdrop / backdrop-issues

Issue tracker for Backdrop core.
144 stars 39 forks source link

[D8][PS] Integrate the functionality of the entity_cache module into core #74

Closed jenlampton closed 4 years ago

jenlampton commented 11 years ago

...but not as a module... just as part of Backdrop, maybe part of the entity system?


Advocate for this issue @hosef


Weekly update?


PR by @hosef: https://github.com/backdrop/backdrop/pull/2927 Change record: https://api.backdropcms.org/change-records/entity-caching-added-to-core

damienmckenna commented 11 years ago

Comparable D8 issue, for reference: https://drupal.org/node/597236

duckzland commented 10 years ago

+1 for entity cache

matt2000 commented 10 years ago

Why shouldn't this be a module?

quicksketch commented 10 years ago

This makes sense being part of core (and always enabled) because entity caching is a level of cache that is reliable and easily cleared and updated automatically when corresponding entities are updated. As long as the appropriate APIs are used (e.g entity_save(), node_save()), adding entity-level caching means that we can save database queries not only for fields, but also for entries in other tables, such as the webform* tables for Webform, path redirects, metadata, or anything else that's not a field. Adding the entity-level cache also means we could remove the field-level cache, since at that point they'd be redundant.

matt2000 commented 10 years ago

I agree that it should be in core and enabled by default, just not that it should be something other than it's own module. It can be hidden if you want, but it should still be replaceable by developers.

I don't agree that it is redundant with field level cache. I'm pretty sure the reasons are covered adequately in the D8 issue discussion.

quicksketch commented 10 years ago

I don't agree that it is redundant with field level cache. I'm pretty sure the reasons are covered adequately in the D8 issue discussion.

I couldn't find explanations in the D8 issue for why the field cache and entity cache are both necessary. I can't figure a reason why both are independently needed, other than if you could disable the entity cache then you'd want the separate field cache. However, entire-object caching makes sense to always have enabled because there's no downside for end-users. There's no risk of stale data because we know when the node/user/term is being updated (via entity_save()) and can clear the cache. It's similar to the filter caches that can't ever be disabled. If there's no risk of stale data, the cache should always enabled.

damienmckenna commented 10 years ago

D8 has also changed to storing all field data with the entity bundle itself, rather than as independent objects, so it makes sense to cache the full objects. There's a discussion in https://drupal.org/node/597236 that may be helpful.

matt2000 commented 10 years ago

https://github.com/backdrop/backdrop-issues/issues/74

Converted just enough to eliminate WSODs and get the tests to run. I'm waiting to see if the test pass, but wanted to share early.

quicksketch commented 9 years ago

This is a good feature for Backdrop for certain. If it's possible to add to a 1.x minor release let's still try to have this as a replacement for field cache, but if not, I'd love to see this in the 2.0 release.

hosef commented 4 years ago

I think I have everything done for this except for one test case that is failing. I have spent a while trying to debug it, and I have no idea why it is failing.

docwilmot commented 4 years ago

@hosef I'm curious; the test fail is $this->assertEqual($url, $node->{$this->field_name}['und'][0]['url']);. Does $node actually exist at that point, and is it the $node you expect?

hosef commented 4 years ago

@docwilmot The node does exist, but it looks like there isn't a field on the node. When I put a debug message in there to print out the node, then it will be missing the field property entirely.

docwilmot commented 4 years ago

But is it the node you expect (should be NID == 3) or is it one of the default nodes from the install profile?

hosef commented 4 years ago

The nid is 1. The tests don't use the default profile, so there is no content unless you make it as part of the test.

I modified the test to add a couple of debug statements, so now it is as follows:

    $test = db_query("SELECT * FROM {field_data_{$this->field_name}}")->fetchAll();
    debug($test);

    $nid = db_query("SELECT MAX(nid) FROM {node}")->fetchField();
    $node = node_load($nid, NULL, TRUE);
    debug($node);
    $this->assertEqual($url, $node->{$this->field_name}['und'][0]['url']);

The output from the first debug is:

array (
  0 => 
  (object) array(
     'entity_type' => 'node',
     'bundle' => 'page',
     'deleted' => '0',
     'entity_id' => '1',
     'revision_id' => '1',
     'language' => 'und',
     'delta' => '0',
     'field_wdnn9v1r_url' => 'javascript:alert("http://example.com/")',
     'field_wdnn9v1r_title' => 'w87pVUF8',
     'field_wdnn9v1r_attributes' => 'a:0:{}',
  ),
)

And the second is:

Node::__set_state(array(
   'nid' => '1',
   'vid' => '1',
   'type' => 'page',
   'langcode' => 'und',
   'title' => 'Simple title',
   'uid' => '2',
   'status' => '1',
   'created' => '1570843057',
   'changed' => '1570843057',
   'scheduled' => '0',
   'comment' => '0',
   'promote' => '0',
   'sticky' => '0',
   'tnid' => '0',
   'translate' => '0',
   'revision_timestamp' => '0',
   'revision_uid' => '0',
   'in_preview' => NULL,
   'log' => '',
   'body' => 
  array (
  ),
   'name' => 'qUOPSXYE',
   'picture' => '0',
   'data' => NULL,
))

So, we know that there is a node and there is data in the field table, but for some reason the field data is not being attached to the node when loading it.

docwilmot commented 4 years ago

Had a look at the PR. The fail happens because of the call to $reset in $node = node_load($nid, NULL, TRUE) (ie the TRUE). So in DefaultEntityController::resetCache() if you comment out the line cache_flush('cache_entity_' . $this->entityType) the test goes green.

Thats the reason for the fail, but cant tell you the fix; hopefully this is helpful.

hosef commented 4 years ago

Ah, thanks. I changed the test to just load the node normally, and it passes now.

klonos commented 4 years ago

That is great news! ...how do we test this? Benchmarks?

Also, related: #4127

docwilmot commented 4 years ago

@hosef I can understand that the fix works, but I'm still not comfortable not knowing why it fails. This suggests to me that under some circumstances, node loads with resets might fail to load fields with entity caching.

hosef commented 4 years ago

@klonos I am not sure how you would test other than making sure nothing looks broken. I came up with a quick benchmark that loads 1000 random nodes 1000 times and put it into a menu callback so I could trigger it by loading a page.

  $controller = entity_get_controller('node');
  $runs = array();

  for ($index = 0; $index < 1000; $index++) {
    $controller->resetStaticCache();

    $nids = db_query('SELECT nid FROM node ORDER BY rand() LIMIT 1000')->fetchCol();

    $start = microtime(TRUE);
    $nodes = node_load_multiple($nids);
    $end = microtime(TRUE);

    $runs[] = $end - $start;
  }

  $total = array_sum($runs);
  backdrop_set_message('Maximum: ' . max($runs));
  backdrop_set_message('Minimum: ' . min($runs));
  backdrop_set_message('Average: ' . ($total / count($runs)));
  backdrop_set_message('Total: ' . $total);

Setup:

Then I cleared caches with drush and loaded the page. Before changes:

Maximum: 2.0363099575043s
Minimum: 0.061290979385376s
Average: 0.083437194108963s
Total: 83.437194108963s

After changes:

Maximum: 2.7388620376587s
Minimum: 0.021826982498169s
Average: 0.045876381874084s
Total: 45.876381874084s

@docwilmot I am also a bit uncomfortable with that, but I have so far been unable to replicate the behavior outside of that test or figure out why it happened in the first place.

quicksketch commented 4 years ago

This seems like it could be a candidate for 1.15.0 with the recent progress on this work.

oadaeh commented 4 years ago

It was requested in the dev meeting yesterday that some functionality be added such that when a new entity type is added, there are no errors due to there not being support in entity cache for the new entity type. In addition to that, there should be functionality for removing what ever gets added, if and when the entity type gets removed.

hosef commented 4 years ago

Since there was some confusion about how much of a difference this was going to make, I came up with some more tests to demonstrate the difference in performance.

First, I ran a test that just loaded all of the nodes from the previous test at once and printed the time. I ran it in a loop 5 times, clearing the static cache between each run.

$controller = entity_get_controller('node');

for ($index = 0; $index < 5; $index++) {
  $controller->resetStaticCache();

  // This gets all of the nids on my test site, of which there are ~10000.
  $nids = db_query('SELECT nid FROM node')->fetchCol();

  $start = microtime(TRUE);
  $nodes = node_load_multiple($nids);
  $end = microtime(TRUE);
  backdrop_set_message(($end - $start) . 's');
}

Before the changes:

48.404771089554s
10.591112852097s
10.397599935532s
10.377990007401s
10.461045026779s

After the Changes:

49.019636154175s
0.65570998191833s
0.63378691673279s
0.64661407470703s
0.64904284477234s

This shows that bulk loading 10000 simple nodes is slightly slower when the cache is empty, but is about 15x faster when they are fully cached. NOTE: It is unlikely that anyone will see a 15x improvement on a live site because this is using a very simple node and bulk loading 10000 of them which is not something that is normally done on a website.

The second test was using a setup with a custom node type with 8 fields, including a paragraphs embed field. The first paragraph then had 3 more embed field to 3 more paragraph types that each had 8 fields. I used a script to generate 1000 nodes of this type and populate the required 4000 paragraph items, then I ran a test using the following code.

$node_controller = entity_get_controller('node');
$paragraph_controller = entity_get_controller('paragraphs_item');

for ($index = 0; $index < 5; $index++) {
  $node_controller->resetStaticCache();
  $paragraph_controller->resetStaticCache();

  $nids = db_query("SELECT nid FROM node WHERE type = 'paragraph_test'")->fetchCol();

  $start = microtime(TRUE);

  $nodes = node_load_multiple($nids);
  foreach($nodes as $nid => $node) {
    $paragraph = paragraphs_item_load($node->field_paragraph_embed[LANGUAGE_NONE][0]['value']);
    $paragraphs = paragraphs_item_load_multiple(array(
      $paragraph->field_test2_embed[LANGUAGE_NONE][0]['value'],
      $paragraph->field_test3_embed[LANGUAGE_NONE][0]['value'],
      $paragraph->field_test4_embed[LANGUAGE_NONE][0]['value'],
    ));
  }

  $end = microtime(TRUE);

  backdrop_set_message(($end - $start) . 's');
}

Before changes:

30.033178091049s
1.0421001911163s
0.98548102378845s
1.0120670795441s
1.0029308795929s

After changes:

33.616456985474s
0.39169788360596s
0.39236688613892s
0.3840639591217s
0.39206504821777s

Again, it is slightly slower on the first run, but it is about 2.5x as fast on the subsequent runs.

hosef commented 4 years ago

Also, the system as it currently exists is opt-in by the module maintainer that created the entity, so if your custom entity would not work with it, then it will not break your site.

jenlampton commented 4 years ago

No reviews since October means this issue is getting bumped to 1.16. :(

docwilmot commented 4 years ago

FWIW I did review this, last week, and it was fine except for some small things.

docwilmot commented 4 years ago

See the PR itself for the review.

jenlampton commented 4 years ago

Ah, sorry @docwilmot I was reading this issue which is still labeled pr - needs code review and has not had a comment since October. Adjusting labels accordingly.

quicksketch commented 4 years ago

I updated the PR based on @docwilmot's review. Thanks for that!

I also found new calls to cache_clear_all() that should be replaced with the newer and more explicit cache() function instead.

As this could have far-reaching consequences I don't think we should rush it into 1.15, but I'm marking this RTBC for 1.16! We can merge it shortly after branching 1.15.x (which will happen Jan 15).

hosef commented 4 years ago

I updated the documentation for hook_entity_info() in entity.api.php so that it shows the new entity cache key and I made the documentation for field cache easier to understand since it could be read multiple ways before.

klonos commented 4 years ago

I also found new calls to cache_clear_all() that should be replaced with the newer and more explicit cache() function instead.

Are we deprecating cache_clear_all() over cache() in general, or is it only specific use cases? Either way, we should be documenting when this is advised, and/or throwing warnings for developers to change occurrences of cache_clear_all() when appropriate.

quicksketch commented 4 years ago

Are we deprecating cache_clear_all() over cache() in general, or is it only specific use cases?

Yes, I think we should. There is an issue at #2158.

quicksketch commented 4 years ago

It's now been a few weeks since 1.15.0 came out and we've branched for 1.15.x. I've merged https://github.com/backdrop/backdrop/pull/2927 into 1.x for 1.16.0! Thanks so much @hosef for taking the lead on this and really following through! Super awesome!

Thanks @matt2000 for the earliest work. Thanks @docwilmot and @oadaeh for your feedback and reviewing!

quicksketch commented 4 years ago

I'm reopening this until we have a change-record to document how it works and what it changes. In particular describing the way that things like the "new" attribute on comments now has to be set after the cache is loaded, rather than being set in the database loading function.

jenlampton commented 4 years ago

I was going to take a stab at creating a change record for this, but I'm afraid I don't really know how to summarize what changed. I was going to copy what went into D8's change record, but apparently they don't have one either.

oadaeh commented 4 years ago

@hosef told me he was going to create a change record for this, since he believed it was not a big deal. I'll just continuously nag him until he gets it done. :)

klonos commented 4 years ago

@hosef added a draft change record here: https://api.backdropcms.org/change-records/entity-caching-added-to-core

quicksketch commented 4 years ago

Looks good! I edited it slightly and added a section on the load() method of storage controllers.

It's now published. Thanks @hosef!

jenlampton commented 4 years ago

So, this issue can be closed now?