vors / jupyter-powershell

PowerShell language kernel for Jupyter
MIT License
132 stars 26 forks source link

Potential fix for cell caching issue #17

Closed chlafreniere closed 4 years ago

chlafreniere commented 4 years ago

I'm by no means an expert in PowerShell, but did some debugging through the kernel to try and understand the heuristic used to determine when outputs are complete for a given cell.

It looks the kernel looks for a '^' to determine the above condition. The trouble is, for multiline cell sources, it appears that n '^' characters will be returned, where n is the number of lines in the cell's source.

So, I'm just keeping track of the number of lines in the source, and only setting the stop_flag to True when the expected carets goes below 1.

vors commented 4 years ago

Oh good finding! It's slightly more complicated actually. The number of expected special chars is going to be equal to the number of separate ps commands, not the number of lines. It's possible to have a single command spanning multiple lines. One common example would be

Get-ChildItem | % {
   Write-Host $_
}

Hm, I cannot come up with a clever hack from the top of my head for that. I think the culprit needs to be addressed at the level of how we writing input in the stdin pipeline: it should be possible to tell PS that the newline is part of a bigger command. In this case, the prompt should be called only a single time at the end of all commands.

chlafreniere commented 4 years ago

Thanks, @vors!

I've been thinking about how writing input the stdin pipeline and am a bit stuck with that solution for now that doesn't involve some crazy regex about determining all of the permutations of how multiline powershell commands can be formatted, and then adding in something like a backtick at appropriate times.

Stepping back, is there another way to solve this problem? Can we add a last command ($error? Get-History | ExecutionStatus? $write-host [(some UUID)]?? something else entirely?) to what we pass powershell, and then once we see the output of the command, we can be sure that the previous commands completed? It would be easy enough to filter out that last command. I'm just not familiar enough with async/Jobs/etc in ps to know if that could potentially work.

chlafreniere commented 4 years ago

if (((Get-History).({ $_.ExecutionStatus -notin 'Completed', 'Stopped' }) | measure).Count -eq 0) { "No runniing commands!" }

may be interesting, but then we're polling at the end, which seems not ideal.

vors commented 4 years ago

I was thinking more about it today too, I think there is an easy workaround.

We should be able to wrap the whole cell in the & {...} block. That would guarantee that there is a single output command.

If anybody wants to try out this solution and let me know does it work / send PR, that would be awesome (and a strong point for becoming a potential maintainer, see #16 ).

chlafreniere commented 4 years ago

Thanks @vors! Let me try this out and I'll get back to you 😄

chlafreniere commented 4 years ago

@vors I tested this and couldn't find any issues. Attached a simple notebook that exhibited symptoms before the fix, and doesn't appear to have the same issues afterwards.

I reached out to @SQLVariant previously around notebooks that also had issues previously, and they didn't repro afterwards.

Deferring to your expertise here, but this definitely looks promising 😄

Archive.zip

vors commented 4 years ago

Awesome, looks like we figured it out. 👍

yualan commented 4 years ago

Hi @vors, we are super excited about this kernel, and so many people are excited to see this in Powershell notebooks. Do you think you can help us make this possible with a new release?

vors commented 4 years ago

I tried to make a quick release, but turned out that I don't remember how :D #18 I hope somebody who can become a maintainer can step up and help.