bgunlp / qpl

Code and dataset for the paper "Semantic Decomposition of Question and SQL for Text-to-SQL Parsing" (EMNLP Findings 2023), Ben Eyal et al., BGU CS NLP Group
https://www.cs.bgu.ac.il/~elhadad/nlpproj/

Error when parsing 'Gather Streams' logical op with 'Parallelism' physical op #7

Closed iMayK closed 4 months ago

iMayK commented 4 months ago

Hi,

I followed the instructions in the dataset_creation directory, and Step 1 completed successfully. However, I encountered an error during Step 2.

The error seems to be caused by a case that is not covered in dataset_creation/mssql-execution-plans-to-qpl/src/main/scala/com/beneyal/qpl/parsing.scala: the combination of logicalOp 'Gather Streams' with physicalOp 'Parallelism'.

Error:


(qpl) mk@i9:~/QPL/qpl/dataset_creation$ for SPLIT in {train,dev}; do scala-cli run mssql-execution-plans-to-qpl -- -s ../../spider -i output/${SPLIT}_spider_with_ep.json -o output/${SPLIT}_qpl.json; done
timestamp=2024-03-11T08:28:24.163576242Z level=ERROR thread=#zio-fiber-1 message="" cause="Exception in thread "zio-fiber-4" java.lang.RuntimeException: Unknown RelOp combo: (Gather Streams, Parallelism) for b9d21e43b3c79125415414018109487bef49eab7921dbfa0f9dff7b81656dc30: com.beneyal.qpl.parsing$.parseRelOp(parsing.scala:80)
com.beneyal.qpl.parsing$.parseTop(parsing.scala:183)
com.beneyal.qpl.parsing$.parseRelOp(parsing.scala:76)
com.beneyal.qpl.parsing$.parseExecutionPlan$$anonfun$1(parsing.scala:19)
scala.util.Try$.apply(Try.scala:210)
com.beneyal.qpl.parsing$.parseExecutionPlan(parsing.scala:19)
com.beneyal.qpl.reading$.$anonfun$112$$anonfun$3(reading.scala:70)
scala.util.Success.flatMap(Try.scala:258)
com.beneyal.qpl.reading$.$anonfun$112(reading.scala:70)
zio.Chunk$Arr.mapChunk(Chunk.scala:1753)
zio.ChunkLike.map(ChunkLike.scala:121)
zio.ChunkLike.map$(ChunkLike.scala:39)
zio.Chunk.map(Chunk.scala:42)
com.beneyal.qpl.reading$.readDataset$$anonfun$2$$anonfun$4$$anonfun$2$$anonfun$3(reading.scala:70)
zio.ZIO.map$$anonfun$1$$anonfun$1(ZIO.scala:960)
zio.UnsafeVersionSpecific.implicitFunctionIsFunction$$anonfun$1(UnsafeVersionSpecific.scala:27)
zio.Unsafe$.unsafe(Unsafe.scala:37)
zio.ZIOCompanionVersionSpecific.succeed$$anonfun$1(ZIOCompanionVersionSpecific.scala:185)
zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:904)
zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
zio.internal.FiberRuntime.evaluateEffect(FiberRuntime.scala:381)
zio.internal.FiberRuntime.evaluateMessageWhileSuspended(FiberRuntime.scala:504)
zio.internal.FiberRuntime.drainQueueOnCurrentThread(FiberRuntime.scala:220)
zio.internal.FiberRuntime.run(FiberRuntime.scala:139)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
java.base/java.lang.Thread.run(Thread.java:833)
    at com.beneyal.qpl.reading$.readDataset$$anonfun$2$$anonfun$4$$anonfun$2$$anonfun$4$$anonfun$1$$anonfun$2(reading.scala:77)
    at zio.Cause.map$$anonfun$1(Cause.scala:418)
    at zio.Cause.flatMap$$anonfun$2(Cause.scala:173)
    at zio.Cause$$anon$9.failCase(Cause.scala:288)
    at zio.Cause$$anon$9.failCase(Cause.scala:287)
    at zio.Cause.loop$2(Cause.scala:221)
    at zio.Cause.foldContext(Cause.scala:248)
    at zio.Cause.foldLog(Cause.scala:305)
    at zio.Cause.flatMap(Cause.scala:179)
    at zio.Cause.map(Cause.scala:418)
    at zio.ZIO.mapError$$anonfun$1(ZIO.scala:981)
    at zio.ZIO.mapErrorCause$$anonfun$1(ZIO.scala:993)
    at com.beneyal.qpl.reading.readDataset(reading.scala:75)
    at com.beneyal.qpl.reading.readDataset(reading.scala:78)
    at com.beneyal.qpl.reading.readDataset(reading.scala:79)
    at com.beneyal.qpl.reading.readDataset(reading.scala:80)
    at com.beneyal.qpl.reading.readDataset(reading.scala:81)
    at com.beneyal.qpl.plantoqpl.program(plantoqpl.scala:888)"
timestamp=2024-03-11T08:28:26.266294225Z level=ERROR thread=#zio-fiber-1 message="" cause="Exception in thread "zio-fiber-4" java.lang.RuntimeException: Unknown RelOp combo: (Gather Streams, Parallelism) for 8a97b1f2dce33f93402760747b78c771827c6f87f26cd8b79c78e074a69bf916: com.beneyal.qpl.parsing$.parseRelOp(parsing.scala:80)
com.beneyal.qpl.parsing$.parseTop(parsing.scala:183)
com.beneyal.qpl.parsing$.parseRelOp(parsing.scala:76)
com.beneyal.qpl.parsing$.parseExecutionPlan$$anonfun$1(parsing.scala:19)
scala.util.Try$.apply(Try.scala:210)
com.beneyal.qpl.parsing$.parseExecutionPlan(parsing.scala:19)
com.beneyal.qpl.reading$.$anonfun$112$$anonfun$3(reading.scala:70)
scala.util.Success.flatMap(Try.scala:258)
com.beneyal.qpl.reading$.$anonfun$112(reading.scala:70)
zio.Chunk$Arr.mapChunk(Chunk.scala:1753)
zio.ChunkLike.map(ChunkLike.scala:121)
zio.ChunkLike.map$(ChunkLike.scala:39)
zio.Chunk.map(Chunk.scala:42)
com.beneyal.qpl.reading$.readDataset$$anonfun$2$$anonfun$4$$anonfun$2$$anonfun$3(reading.scala:70)
zio.ZIO.map$$anonfun$1$$anonfun$1(ZIO.scala:960)
zio.UnsafeVersionSpecific.implicitFunctionIsFunction$$anonfun$1(UnsafeVersionSpecific.scala:27)
zio.Unsafe$.unsafe(Unsafe.scala:37)
zio.ZIOCompanionVersionSpecific.succeed$$anonfun$1(ZIOCompanionVersionSpecific.scala:185)
zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:904)
zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
zio.internal.FiberRuntime.evaluateEffect(FiberRuntime.scala:381)
zio.internal.FiberRuntime.evaluateMessageWhileSuspended(FiberRuntime.scala:504)
zio.internal.FiberRuntime.drainQueueOnCurrentThread(FiberRuntime.scala:220)
zio.internal.FiberRuntime.run(FiberRuntime.scala:139)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
java.base/java.lang.Thread.run(Thread.java:833)
    at com.beneyal.qpl.reading$.readDataset$$anonfun$2$$anonfun$4$$anonfun$2$$anonfun$4$$anonfun$1$$anonfun$2(reading.scala:77)
    at zio.Cause.map$$anonfun$1(Cause.scala:418)
    at zio.Cause.flatMap$$anonfun$2(Cause.scala:173)
    at zio.Cause$$anon$9.failCase(Cause.scala:288)
    at zio.Cause$$anon$9.failCase(Cause.scala:287)
    at zio.Cause.loop$2(Cause.scala:221)
    at zio.Cause.foldContext(Cause.scala:248)
    at zio.Cause.foldLog(Cause.scala:305)
    at zio.Cause.flatMap(Cause.scala:179)
    at zio.Cause.map(Cause.scala:418)
    at zio.ZIO.mapError$$anonfun$1(ZIO.scala:981)
    at zio.ZIO.mapErrorCause$$anonfun$1(ZIO.scala:993)
    at com.beneyal.qpl.reading.readDataset(reading.scala:75)
    at com.beneyal.qpl.reading.readDataset(reading.scala:78)
    at com.beneyal.qpl.reading.readDataset(reading.scala:79)
    at com.beneyal.qpl.reading.readDataset(reading.scala:80)
    at com.beneyal.qpl.reading.readDataset(reading.scala:81)
    at com.beneyal.qpl.plantoqpl.program(plantoqpl.scala:888)"
beneyal commented 4 months ago

Hello Mayank,

The problem you are facing happens because, by default, SQL Server is allowed to produce parallel execution plans, and those plans contain "Parallelism" nodes such as the "Gather Streams" operator in your error. To avoid that, we need to turn off parallelism, i.e., set the max degree of parallelism to 1.

This snippet will do just that (if your database's name is not spider, change the first line):

USE spider;
GO   
EXEC sp_configure 'show advanced options', 1;  
GO  
RECONFIGURE WITH OVERRIDE;  
GO  
EXEC sp_configure 'max degree of parallelism', 1;  
GO  
RECONFIGURE WITH OVERRIDE;  
GO

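Because 'max degree of parallelism' is a server-level option set through sp_configure, this should only need to be run once per SQL Server instance rather than once per database. You can run the snippet either in SQL Server Management Studio or with the sqlcmd command-line tool.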
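
Execution plans generated before the setting was changed will presumably still contain "Parallelism" operators, so Step 1 likely has to be rerun before retrying Step 2. As a rough way to check a plan, here is a minimal Python sketch. It assumes the plan is available as a showplan XML string, in which each operator is a RelOp element with LogicalOp and PhysicalOp attributes. The function name find_parallelism_ops and the sample plan are hypothetical and not part of the repository, and the way plans are stored in the *_spider_with_ep.json files may differ.

```python
import xml.etree.ElementTree as ET


def find_parallelism_ops(showplan_xml: str) -> list[tuple[str, str]]:
    """Return (LogicalOp, PhysicalOp) pairs for every Parallelism operator in a plan."""
    root = ET.fromstring(showplan_xml)
    hits = []
    for elem in root.iter():
        # Compare on the local tag name so the check works with or without
        # the showplan XML namespace prefix.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "RelOp" and elem.get("PhysicalOp") == "Parallelism":
            hits.append((elem.get("LogicalOp", ""), "Parallelism"))
    return hits


# Hypothetical, heavily trimmed plan used only for illustration.
SAMPLE_PLAN = """
<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <RelOp NodeId="0" PhysicalOp="Parallelism" LogicalOp="Gather Streams">
    <RelOp NodeId="1" PhysicalOp="Clustered Index Scan" LogicalOp="Clustered Index Scan"/>
  </RelOp>
</ShowPlanXML>
"""

if __name__ == "__main__":
    # An empty list would mean the plan has no Parallelism operators.
    print(find_parallelism_ops(SAMPLE_PLAN))
```

If a regenerated plan still yields a non-empty list, the 'max degree of parallelism' change probably did not take effect on the instance that produced it.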