FalcorJSON $__status is 'resolved' when model progressively returns partial response from cache

jameslaneconkling commented 7 years ago

Is the FalcorJSON.$__status property supposed to reliably determine whether or not a model query is done emitting?

I'm running into the following issue:

model cache is populated with some results, e.g. [search, 1, { to: 9 }, title]
query model that emits progressively for paths, some of which can resolve in the local cache

model.get(['search', 1, { to: 19 }, 'title']).progressively()
  .subscribe({
    next(falcorJSON => {
      console.log('response:', falcorJSON.json.search[1]);
      console.log('status:', falcorJSON.json.$__status);
    })
  });

model emits twice, first with partial response from cache, second with total response. But each time w/ resolved status

// > response: { 0: { title: '...' }, ..., 9: { title: '...' } }
// > status: resolved
// sometime later...
// > response: { 0: { title: '...' }, ..., 19: { title: '...' } }
// > status: resolved

Is this a correct usage of $__status?

jameslaneconkling commented 7 years ago

Interesting, $__status does work as expected if recycleJSON is true.

I can put together a failing test case, if helpful.

trxcllnt commented 7 years ago

@jameslaneconkling you're intuition about $__status is right, it's a metadata flag that should indicate whether a streaming request is still waiting to hear back about some paths under the current branch node. We use it to optionally render loading indicators in React components at different levels in the render tree.

I thought I had test coverage, but it looks like I only have tests covering the recycleJSON mode. I'll add a test for non-recycle mode and see if I can't isolate the issue. Thanks for filing this issue!

trxcllnt commented 7 years ago

@jameslaneconkling ok, I think I fixed the issue. I just published v2.9.7 to npm, let me know if this doesn't solve the problem for you.

jameslaneconkling commented 7 years ago

So, the above issue was failing two of my tests with recycleJSON set to false. The 2.9.7 release passes one of the tests, but still fails on the second.

// fails 2.9.6 but passes 2.9.7 when recycleJSON is false
test('Should emit progressively for query in local cache and remote service', (t) => {
  t.plan(2);
  const {
    stream: change$,
    handler: graphChange
  } = createEventHandler();

  const model = new Model({
    source: {
      get: () => Observable.of({
        paths: [['items', 1, 'title']],
        jsonGraph: {
          items: {
            1: { $type: 'ref', value: ['item', 'b'] }
          },
          item: {
            b: { title: 'Item B' }
          }
        }
      }).delay(100)
    },
    cache: {
      items: {
        0: { $type: 'ref', value: ['item', 'a'] }
      },
      item: {
        a: { title: 'Item A' }
      }
    },
    recycleJSON: RECYCLEJSON,
    onChange: graphChange
  });

  const paths = () => [['items', [0, 1], 'title']];

  const expectedResults = [
    {
      graphFragment: { json: { items: { 0: { title: 'Item A' } } } },
      graphFragmentStatus: 'next',
      id: 2
    },
    {
      graphFragment: { json: { items: { 0: { title: 'Item A' }, 1: { title: 'Item B' } } } },
      graphFragmentStatus: 'complete',
      id: 2
    }
  ];

  withGraphFragment(paths, model, change$)(Observable.of({ id: 2 }))
    .subscribe(tapeResultObserver(t, RECYCLEJSON)(expectedResults));
});

# Should emit progressively for query in local cache and remote service
  ✖ emission 2 should match expected output
    operator: deepEqual
    expected: |-
      { graphFragment: { json: { items: { 0: { title: 'Item A' }, 1: { title: 'Item B' } } } }, graphFragmentStatus: 'complete', id: 2 }
    actual: |-
      { id: 2, graphFragment: { json: { items: { 0: { title: 'Item A' }, 1: { title: 'Item B' } } } }, graphFragmentStatus: 'next' }

// fails both 2.9.6 and 2.9.7 when recycleJSON is false
test('Should emit next when props change updates path', (t) => {
  t.plan(4);
  const {
    stream: change$,
    handler: graphChange
  } = createEventHandler();

  let i = 0;
  const model = new Model({
    source: {
      get: () => {
        i += 1;
        if (i === 1) {
          return Observable.of({
            paths: [['items', { to: 1 }, 'title'],['items', 'length']],
            jsonGraph: {
              items: {
                0: { $type: 'ref', value: ['item', 'a'] },
                1: { $type: 'ref', value: ['item', 'b'] },
                length: 50
              },
              item: {
                a: { title: 'Item A' },
                b: { title: 'Item B' }
              }
            }
          }).delay(100);
        } else if (i === 2) {
          return Observable.of({
            paths: [['items', 2, 'title']],
            jsonGraph: {
              items: {
                2: { $type: 'ref', value: ['item', 'c'] }
              },
              item: {
                c: { title: 'Item C' }
              }
            }
          }).delay(100);
        }

        return Observable.throw('Shouldn\'t run');
      }
    },
    recycleJSON: RECYCLEJSON,
    onChange: graphChange
  });

  const paths = ({ range }) => [['items', range, 'title'], ['items', 'length']];

  const expectedResults = [
    {
      graphFragment: {},
      graphFragmentStatus: 'next',
      range: { to: 1 }
    },
    {
      graphFragment: { json: { items: { 0: { title: 'Item A' }, 1: { title: 'Item B' }, length: 50 } } },
      graphFragmentStatus: 'complete',
      range: { to: 1 }
    },
    {
      graphFragment: { json: { items: { 0: { title: 'Item A' }, 1: { title: 'Item B' }, length: 50 } } },
      graphFragmentStatus: 'next',
      range: { to: 2 }
    },
    {
      graphFragment: { json: { items: { 0: { title: 'Item A' }, 1: { title: 'Item B' }, 2: { title: 'Item C' }, length: 50 } } },
      graphFragmentStatus: 'complete',
      range: { to: 2 }
    },
  ];

  withGraphFragment(paths, model, change$)(
    Observable.of({ range: { to: 1 } })
      .concat(Observable.of({ range: { to: 2 } }).delay(200))
  )
    .subscribe(tapeResultObserver(t, RECYCLEJSON)(expectedResults));
});

  ✖ emission 3 should match expected output
    operator: deepEqual
    expected: |-
      { graphFragment: { json: { items: { 0: { title: 'Item A' }, 1: { title: 'Item B' }, length: 50 } } }, graphFragmentStatus: 'next', range: { to: 2 } }
    actual: |-
      { range: { to: 2 }, graphFragment: { json: { items: { 0: { title: 'Item A' }, 1: { title: 'Item B' }, length: 50 } } }, graphFragmentStatus: 'complete' }

There's some of my own application code in there. If helpful, I can rewrite the tests to just use the model so output is more comprehensible. The translation bt/ the model logic and my app is pretty light, though: falcorJSON.json.$__status === 'pending' | 'resolved translates to graphFragmentStatus: 'next' | 'complete'.

trxcllnt commented 7 years ago

@jameslaneconkling it looks like the second test may be expecting the wrong first value? I wouldn't expect progressive get to emit an empty JSON branch when none of the requested paths are in the cache, only when at least one is.

I have a demo in esnextb.in that I think replicates your test, and the output looks right to me:

const m = new Model({ source: getDataSource() });
const get1 = Observable.from(m.get(['items', 'length'], ['items', {to: 1}, 'title']).progressively());
const get2 = Observable.from(m.get(['items', 'length'], ['items', {to: 2}, 'title']).progressively());

Observable.concat(
  get1.do((x) => console.log(`get1:`, x.json.$__status, x.json.toString())),
  get2.do((x) => console.log(`get2:`, x.json.$__status, x.json.toString()))
)
.subscribe();

/*
> dataSource get 0
> get1: resolved {"items":{"0":{"title":"Item A"},"1":{"title":"Item B"},"length":50}}
> get2: pending {"items":{"0":{"title":"Item A"},"1":{"title":"Item B"},"length":50}}
> dataSource get 1
> get2: resolved {"items":{"0":{"title":"Item A"},"1":{"title":"Item B"},"2":{"title":"Item C"},"length":50}}
*/

jameslaneconkling commented 7 years ago

@trxcllnt my bad. Yeah, the leading emit is application logic on my end: the withGraphFragment always emits an immediate, synchronous value, possibly w/ an empty graphFragment if none of the paths exist in the cache.

You're test does demonstrate what I'd expect (resolved, pending, resolved), though in my test case I'm seeing something else: resolved, resolved, resolved. I'll try to recreate outside of my app to ensure it's not a bug on my end.

jameslaneconkling commented 7 years ago

OK, have can recreate the issue:

const get1 = Observable.from(m.get(['items', {to: 1}, 'title'], ['items', 'length']).progressively());
const get2 = Observable.from(m.get(['items', {to: 2}, 'title'], ['items', 'length']).progressively());

/*

get1: resolved {"items":{"0":{"title":"Item A"},"1":{"title":"Item B"},"length":50}}
get2: resolved {"items":{"0":{"title":"Item A"},"1":{"title":"Item B"},"length":50}}
get2: resolved {"items":{"0":{"title":"Item A"},"1":{"title":"Item B"},"2":{"title":"Item C"},"length":50}}
*/

Notice, the paths are switched compared to your version (['items', 'length'] comes second).

trxcllnt commented 7 years ago

@jameslaneconkling OH duh. I know exactly what's going wrong. Will work on a fix today. Thanks again 🥇 !

trxcllnt commented 7 years ago

@jameslaneconkling after investigating this a bit, I think I know why I punted on making this work for the old mode. The commit 1571eeb73b9ed02fa7a73b5af9154ac42a5d4543 will fix it for the case you've encountered here, but not for all scenarios. Specifically, it's only fixed when the first path is at least partially in the cache. If the first path is entirely missing, but the second path isn't, you'll see the behavior in this issue again.

To recap, this now works:

const get1 = m.get(['items', {to: 1}, 'title'], ['items', 'length']).progressively();
const get2 = m.get(['items', {to: 2}, 'title'], ['items', 'length']).progressively();
/*
> dataSource get 0
> get1: resolved {"items":{"0":{"title":"Item A"},"1":{"title":"Item B"},"length":50}}
> get2: pending {"items":{"0":{"title":"Item A"},"1":{"title":"Item B"},"length":50}}
> dataSource get 1
> get2: resolved {"items":{"0":{"title":"Item A"},"1":{"title":"Item B"},"2":{"title":"Item C"},"length":50}}
*/

But if we change the first path of get2 to something that's fully not in the cache yet, you'll see this again:

const get1 = m.get(['items', {to: 1}, 'title'], ['items', 'length']).progressively();
const get2 = m.get(['items', 2, 'title'], ['items', 'length']).progressively();
/*
> dataSource get 0
> get1: resolved {"items":{"0":{"title":"Item A"},"1":{"title":"Item B"},"length":50}}
> get2: resolved {"items":{"length":50}} // <-- uh-oh
> dataSource get 1
> get2: resolved {"items":{"2":{"title":"Item C"},"length":50}}
*/

The reason why is we try to avoid allocations during get by not creating JSON branch nodes when paths aren't in the cache. If the first path is entirely not in the cache, we don't create any JSON branches for it. Since we process paths independently from each other, when we go to create the branch nodes for the second path (['items', 'length']), its branches are marked as "resolved", since it doesn't have any information about the first path's cache misses.

The recycleJSON version works because we collapse requested paths into a single path tree (so all paths are represented in the same data structure). This is necessary for computing unique hashcodes, but is more expensive up-front. This isn't a problem in the real world because we can memoize the collapse, but this optimization isn't something the base falcor library has enough information to do internally.

In the degenerate case that we're calling get with recycleJSON set to true, but not using the @graphistry/falcor-query-syntax module to memoize the path tree creation, we eat the cost and do a just-in-time collapse. It's a bit slower, but at least it works.

I see two possible solutions to the problem of the general case:

You can turn on the materialized flag, which will always create intermediate branch nodes in the JSON for missing paths. This will also ensure Falcor onNext's your Observable immediately with a value, even if nothing's in the cache. The value will include all branch nodes requested, with undefined at each leaf position.
I can change the internals of get to always collapse paths into a single path tree, and reuse the same cache search function as recycleJSON: true. This fix will significantly impact get's performance, and will require changing a lot of unit tests that validate the requested missing paths are such-and-such. But since it's more correct, I'd be open to the idea of making it the default for progressive mode.

For reference, here's the output of the example from this issue in materialized mode:

const m = new Model({ source: getDataSource(), materialized: true });
const get1 = m.get(['items', {to: 1}, 'title'], ['items', 'length']).progressively();
const get2 = m.get(['items', 2, 'title'], ['items', 'length']).progressively();
/*
> get1: pending {"items":{"0":{},"1":{}}}
> dataSource get 0
> get1: resolved {"items":{"0":{"title":"Item A"},"1":{"title":"Item B"},"length":50}}
> get2: pending {"items":{"2":{},"length":50}}
> dataSource get 1
> get2: resolved {"items":{"2":{"title":"Item C"},"length":50}}
*/

jameslaneconkling commented 7 years ago

OK, I can confirm that the fix does pass my test case, but does not pass the case outlined above where the first path in the second query is completely missing from the cache.

So as I understand it, for paths that don't hit the cache, there is no way to preserve the information that there was a cache miss when the entire query emits (when recycleJSON and materialized are false), yeah?

On my end, I could just punt on this fix and start using recycleJSON, which I've been putting off b/c there are a few bug regressions when I do make the change (very likely in my own application code). Also, my perf tests weren't significantly better with recycleJSON, though based on the above, maybe because I'm using path array syntax rather than @graphistry/falcor-query-syntax I'm forgoing some of the perf boosts I should be expecting from recycleJSON?

Anyhow, I don't have a good answer to the above. guaranteed consistent behavior for recycleJSON: false would be nice, though I totally get it if the library prioritizes recycleJSON: true is the priority for optimizations and testing.

graphistry / falcor

FalcorJSON $__status is 'resolved' when model progressively returns partial response from cache #15