mskcc / smile-server


Duplicate Tempo nodes on Samples #1201

Closed ao508 closed 1 week ago

ao508 commented 1 week ago

I'm going to walk Fox, but I'm adding a note here first (I'll turn it into a card when I get back).

What I've found is that, at least in some cases, not all samples are being imported and linked to their cohort, because a database exception occurs while the Cohort complete event is being handled.

Stack trace in smile-server:

2024-06-24 19:30:52.428 ERROR 1 --- [pool-6-thread-4] .m.s.s.i.TempoMessageHandlingServiceImpl : Error during handling of Cohort complete event

org.springframework.dao.IncorrectResultSizeDataAccessException: Incorrect result size: expected at most 1
    at org.springframework.data.neo4j.repository.query.GraphQueryExecution$SingleEntityExecution.execute(GraphQueryExecution.java:73) ~[spring-data-neo4j-5.3.3.RELEASE.jar!/:5.3.3.RELEASE]
    at org.springframework.data.neo4j.repository.query.GraphRepositoryQuery.doExecute(GraphRepositoryQuery.java:76) ~[spring-data-neo4j-5.3.3.RELEASE.jar!/:5.3.3.RELEASE]
    at org.springframework.data.neo4j.repository.query.AbstractGraphRepositoryQuery.execute(AbstractGraphRepositoryQuery.java:57) ~[spring-data-neo4j-5.3.3.RELEASE.jar!/:5.3.3.RELEASE]
    at org.springframework.data.repository.core.support.QueryExecutorMethodInterceptor$QueryMethodInvoker.invoke(QueryExecutorMethodInterceptor.java:195) ~[spring-data-commons-2.3.3.RELEASE.jar!/:2.3.3.RELEASE]
    at org.springframework.data.repository.core.support.QueryExecutorMethodInterceptor.doInvoke(QueryExecutorMethodInterceptor.java:152) ~[spring-data-commons-2.3.3.RELEASE.jar!/:2.3.3.RELEASE]
    at org.springframework.data.repository.core.support.QueryExecutorMethodInterceptor.invoke(QueryExecutorMethodInterceptor.java:130) ~[spring-data-commons-2.3.3.RELEASE.jar!/:2.3.3.RELEASE]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) ~[spring-aop-5.2.8.RELEASE.jar!/:5.2.8.RELEASE]
    at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:367) ~[spring-tx-5.2.8.RELEASE.jar!/:5.2.8.RELEASE]
    at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:118) ~[spring-tx-5.2.8.RELEASE.jar!/:5.2.8.RELEASE]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) ~[spring-aop-5.2.8.RELEASE.jar!/:5.2.8.RELEASE]
    at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139) ~[spring-tx-5.2.8.RELEASE.jar!/:5.2.8.RELEASE]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) ~[spring-aop-5.2.8.RELEASE.jar!/:5.2.8.RELEASE]
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:95) ~[spring-aop-5.2.8.RELEASE.jar!/:5.2.8.RELEASE]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) ~[spring-aop-5.2.8.RELEASE.jar!/:5.2.8.RELEASE]
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) ~[spring-aop-5.2.8.RELEASE.jar!/:5.2.8.RELEASE]
    at com.sun.proxy.$Proxy113.findTempoBySamplePrimaryId(Unknown Source) ~[na:na]
    at org.mskcc.smile.service.impl.TempoServiceImpl.getTempoDataBySamplePrimaryId(TempoServiceImpl.java:68) ~[service-0.1.0.jar!/:0.1.0]
    at org.mskcc.smile.service.impl.TempoServiceImpl$$FastClassBySpringCGLIB$$bee97256.invoke(<generated>) ~[service-0.1.0.jar!/:0.1.0]
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) ~[spring-core-5.2.8.RELEASE.jar!/:5.2.8.RELEASE]
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:687) ~[spring-aop-5.2.8.RELEASE.jar!/:5.2.8.RELEASE]
    at org.mskcc.smile.service.impl.TempoServiceImpl$$EnhancerBySpringCGLIB$$c293d4aa.getTempoDataBySamplePrimaryId(<generated>) ~[service-0.1.0.jar!/:0.1.0]
    at org.mskcc.smile.service.impl.CohortCompleteServiceImpl.saveCohort(CohortCompleteServiceImpl.java:56) ~[service-0.1.0.jar!/:0.1.0]
    at org.mskcc.smile.service.impl.TempoMessageHandlingServiceImpl$CohortCompleteHandler.run(TempoMessageHandlingServiceImpl.java:274) ~[service-0.1.0.jar!/:0.1.0]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[na:na]
    at java.base/java.lang.Thread.run(Thread.java:829) ~[na:na]

Example Cypher query for one sample that causes the error above:

MATCH (t:Tempo)<-[:HAS_TEMPO]-(s:Sample)-[:HAS_METADATA]->(sm:SampleMetadata)
WHERE sm.primaryId = "12497_D_3"
WITH s AS s1, sm AS sMetadata, count(t) AS tCount
WHERE tCount > 1
RETURN s1.smileSampleId, sMetadata.primaryId, tCount

Seems that duplicate Tempo nodes on a single sample are the root cause.
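
To make the link between the duplicates and the exception concrete: the trace passes through `GraphQueryExecution$SingleEntityExecution`, which suggests `findTempoBySamplePrimaryId` is treated as a query that returns at most one entity. Below is a minimal plain-Java sketch of that contract, not the Spring Data implementation. The class name, the helper `atMostOne`, and the stand-in exception type are all hypothetical, and the strings `tempo-A` and `tempo-B` are placeholder values.

```java
import java.util.List;
import java.util.Optional;

final class SingleResultLookupSketch {

    /** Hypothetical stand-in for the exception in the trace; not the Spring class. */
    static final class IncorrectResultSizeSketchException extends RuntimeException {
        IncorrectResultSizeSketchException(int actual) {
            super("Incorrect result size: expected at most 1, actual " + actual);
        }
    }

    /** Returns the only row, empty if there is none, and fails if several rows matched. */
    static <T> Optional<T> atMostOne(List<T> rows) {
        if (rows.size() > 1) {
            throw new IncorrectResultSizeSketchException(rows.size());
        }
        return rows.stream().findFirst();
    }

    public static void main(String[] args) {
        // One Tempo node for the sample: the lookup succeeds.
        System.out.println(atMostOne(List.of("tempo-A")));

        // Two Tempo nodes for the same sample: the lookup fails.
        try {
            atMostOne(List.of("tempo-A", "tempo-B"));
        } catch (IncorrectResultSizeSketchException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this reading, every sample with more than one Tempo node will keep failing the lookup until the extra nodes are merged or removed.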

That diagnoses the issue we're seeing, but we still need to:

(1) identify how many of these cases currently exist in the database, since they will keep throwing that IncorrectResultSizeDataAccessException;
(2) run a script or query that identifies and corrects these cases by merging the multiple Tempo nodes into a single node per sample;
(3) dig into how and why this occurred, how likely it is to happen again, etc.
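
As a rough illustration of the identification and merge-planning steps, here is a small in-memory Java sketch. It assumes the database has already been queried for (sample primary ID, Tempo node ID) pairs. The class and method names, the numeric node IDs, and the `SAMPLE_B` ID are hypothetical, and "keep the first node seen" is only a placeholder policy. How properties and relationships of the extra nodes should actually be merged is not decided here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class DuplicateTempoAuditSketch {

    /**
     * Groups (samplePrimaryId, tempoNodeId) rows by sample and keeps only
     * the samples that have more than one Tempo node.
     */
    static Map<String, List<Long>> findDuplicates(List<Map.Entry<String, Long>> rows) {
        Map<String, List<Long>> bySample = new TreeMap<>();
        for (Map.Entry<String, Long> row : rows) {
            bySample.computeIfAbsent(row.getKey(), k -> new ArrayList<>()).add(row.getValue());
        }
        // Drop samples with a single Tempo node; they are not affected.
        bySample.values().removeIf(tempoIds -> tempoIds.size() < 2);
        return bySample;
    }

    public static void main(String[] args) {
        // Hypothetical rows; the node IDs are made up for illustration.
        List<Map.Entry<String, Long>> rows = List.of(
            Map.entry("12497_D_3", 101L),
            Map.entry("12497_D_3", 102L),
            Map.entry("SAMPLE_B", 201L));

        Map<String, List<Long>> duplicates = findDuplicates(rows);
        System.out.println("affected samples: " + duplicates.size());

        // Placeholder policy: keep the first node seen, merge the rest into it.
        duplicates.forEach((sample, ids) ->
            System.out.println(sample + " keep=" + ids.get(0)
                + " merge=" + ids.subList(1, ids.size())));
    }
}
```

The size of the returned map answers (1), and the per-sample ID lists give a starting point for (2).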


INITIAL FINDINGS

Samples affected in production: 1,207
* By comparison, zero records are affected on the dev server.