microsoft / psi

Platform for Situated Intelligence
https://github.com/microsoft/psi/wiki

[Bug] PsiStore Concatenate likely to be Broken #126

Open xiangzhi opened 3 years ago

xiangzhi commented 3 years ago

I'm trying to concatenate two PsiStores that were first split in half using PsiStoreTool's crop functionality. When I ran the concatenation, it failed with an exception.

The first error:

System.ArgumentException
  HResult=0x80070057
  Message=Originating Lifetime Overlap
  Source=Microsoft.Psi
  StackTrace:
   at Microsoft.Psi.PsiStore.Concatenate(IEnumerable`1 storeFiles, ValueTuple`2 output, IProgress`1 progress, Action`1 loggingCallback) in C:\Users\Zhi\source\repos\CMU-TBD\psi\Sources\Runtime\Microsoft.Psi\Data\PsiStore.cs:line 347

I was fairly confident that my stores do not overlap in time, so I checked the code (PsiStore.cs, line 331):

// validate types match across stores and stream lifetimes don't overlap
foreach (var stream in group)
{
    totalMessageCount += stream.MessageCount;
    loggingCallback?.Invoke($"  Partition: {stream.PartitionName} {stream.Id} ({stream.TypeName.Split(',')[0]}) {stream.FirstMessageOriginatingTime}-{stream.LastMessageOriginatingTime}");
    if (group.GroupBy(pair => pair.TypeName).Count() != 1)
    {
        throw new ArgumentException("Type Mismatch");
    }

    foreach (var crosscheck in group)
    {
        var originatingLifetime = crosscheck.MessageCount == 0 ? TimeInterval.Empty : new TimeInterval(crosscheck.FirstMessageOriginatingTime, crosscheck.LastMessageOriginatingTime);
        if (crosscheck != stream && originatingLifetime.IntersectsWith(originatingLifetime))
        {
            throw new ArgumentException("Originating Lifetime Overlap");
        }
    }
}

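The check appears to compare the interval against itself: originatingLifetime.IntersectsWith(originatingLifetime). A non-empty interval always intersects itself, so the exception is thrown as soon as the group contains a second stream with messages, whether or not the lifetimes actually overlap. This should be an easy fix: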

// lifetime of the stream currently being validated
var originatingLifetime = stream.MessageCount == 0 ? TimeInterval.Empty : new TimeInterval(stream.FirstMessageOriginatingTime, stream.LastMessageOriginatingTime);
foreach (var crosscheck in group)
{
    // lifetime of the other stream, compared against the one above rather than against itself
    var crosscheckOriginatingLifetime = crosscheck.MessageCount == 0 ? TimeInterval.Empty : new TimeInterval(crosscheck.FirstMessageOriginatingTime, crosscheck.LastMessageOriginatingTime);
    if (crosscheck != stream && originatingLifetime.IntersectsWith(crosscheckOriginatingLifetime))
    {
        throw new ArgumentException("Originating Lifetime Overlap");
    }
}
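
To show the intended pairwise check in isolation, here is a minimal self-contained sketch. Lifetime and OverlapCheck are hypothetical stand-ins written for this illustration, not the Microsoft.Psi TimeInterval type, and treating both endpoints as inclusive is an assumption that may differ from how IntersectsWith handles endpoints.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a stream's originating lifetime (not Microsoft.Psi.TimeInterval).
public readonly struct Lifetime
{
    public Lifetime(DateTime first, DateTime last)
    {
        First = first;
        Last = last;
    }

    public DateTime First { get; }

    public DateTime Last { get; }

    // Assumption: closed intervals. Two closed intervals overlap when each
    // one starts no later than the other one ends.
    public bool Overlaps(Lifetime other) => First <= other.Last && other.First <= Last;
}

public static class OverlapCheck
{
    // Returns true if any two distinct lifetimes in the list overlap.
    // Each pair is compared once, and a lifetime is never compared with itself.
    public static bool AnyOverlap(IReadOnlyList<Lifetime> lifetimes)
    {
        for (int i = 0; i < lifetimes.Count; i++)
        {
            for (int j = i + 1; j < lifetimes.Count; j++)
            {
                if (lifetimes[i].Overlaps(lifetimes[j]))
                {
                    return true;
                }
            }
        }

        return false;
    }
}
```

With two halves of a recording that do not share any time range, AnyOverlap returns false. Comparing a lifetime with itself, as the original check effectively does, reports an overlap for any stream that has messages.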

After that fix, the application threw a different error:

System.InvalidOperationException
  HResult=0x80131509
  Message=Source component added when pipeline already running. Consider using Subpipeline.
  Source=Microsoft.Psi
  StackTrace:
   at Microsoft.Psi.Pipeline.GetOrCreateNode(Object component) in C:\Users\Zhi\source\repos\CMU-TBD\psi\Sources\Runtime\Microsoft.Psi\Executive\Pipeline.cs:line 813

I wasn't able to isolate why this happens, but I suspect it is because the LINQ query at line 304 is evaluated multiple times. There also seem to be other errors.
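
As background for that suspicion, the sketch below shows how a deferred LINQ query re-runs its selector on every enumeration unless it is materialized first. It does not reproduce the Psi code: DeferredDemo and CountSelectorRuns are hypothetical names, and the counter stands in for a side effect such as adding a component to a pipeline.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Enumerates the same query twice and returns how many times the
    // selector ran. The selector stands in for a side-effecting call.
    public static int CountSelectorRuns(bool materialize)
    {
        int runs = 0;
        IEnumerable<int> query = new[] { 1, 2, 3 }.Select(x =>
        {
            runs++;
            return x * 2;
        });

        if (materialize)
        {
            // ToList runs the selector once per element and caches the results.
            query = query.ToList();
        }

        // First enumeration.
        foreach (var unused in query)
        {
        }

        // Second enumeration: re-runs the selector unless the query was materialized.
        foreach (var unused in query)
        {
        }

        return runs;
    }
}
```

CountSelectorRuns(false) returns 6 because the selector runs for all three elements on each of the two enumerations, while CountSelectorRuns(true) returns 3. If the query at line 304 has side effects and is enumerated again after the pipeline has started, that could be consistent with the error above, though this is unverified.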

Please let me know if y'all can reproduce my error or need anything else. Here are the arguments I used when running it:

concat -p E:\Data\Lab-Store\2021-04-14\phantom-body-test -d Cropped.0001\Cropped;Cropped.0002\Cropped