mpastell / Pweave

Pweave is a scientific report generator and a literate programming tool for Python. It can capture the results and plots from data analysis and works well with numpy, scipy and matplotlib.
http://mpastell.com/pweave

Bug/Issue: matplotlib charts bulk outputted at the end. #108

Open dsanalytics opened 6 years ago

dsanalytics commented 6 years ago

I'm looking for a solution to an issue where all matplotlib charts are output at the end instead of in the order specified. E.g. the processing loops through many calculations, and each iteration outputs some text (parameter values, etc.) followed by charts. This is shown properly in Atom's results window, but when the same code is run through pweave, it outputs all the text first and then all the charts at the end.

In other words, here’s what Hydrogen Output Area in Atom shows:

text1 text2 chart1 chart2 text3 text4 chart3 chart4

and, here's how it looks in pweave html output:

text1 text2 text3 text4 chart1 chart2 chart3 chart4

There are no crashes or error reports. Please advise if there's a workaround or a fix - thanks.

Env: Windows 7/10 64bit Home Python: 3.6 64bit Pweave: 0.4.2

piccolbo commented 6 years ago

Maybe chunk option term=True

mpastell commented 6 years ago

This is the intended behavior. The chunk option term=True will force the output to appear after each statement. The reason for the default is that plots can be updated within a chunk, and usually you'd want only one figure as output, e.g. from the following dummy example:

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 2*np.pi)
cx = np.cos(x)
sx = np.sin(x)
plt.plot(x, sx)
plt.plot(x, cx)

Using term=True works only partially, and a pull request fixing this would be welcome. I don't have time to work on it myself at the moment.

dsanalytics commented 6 years ago

@mpastell Thanks for chiming in - I understand. I cannot combine everything into one chart, because the data to be displayed differs from one iteration to the next - just imagine your code example wrapped in a for-x-in-y loop.

Are we sure that this is not matplotlib? I initially thought that some extra matplotlib setting and/or call would fix this. I'm already calling show() at the end of each iteration plotting so that is not the fix.
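The loop case described above can be sketched roughly as follows. This is only an illustration, not the code from the original report: the function name run_iterations, the frequency values, and the Agg backend are assumptions made so the sketch runs without a display. In an interactive session, show() would be called at the end of each iteration, as mentioned above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def run_iterations(frequencies):
    """Print one line of text per iteration, then draw that iteration's chart."""
    x = np.linspace(0, 2 * np.pi)
    figures = []
    for freq in frequencies:
        # Text output for this iteration (e.g. parameter values).
        print(f"frequency = {freq}")
        # A separate figure per iteration, because the data differs each time.
        fig, ax = plt.subplots()
        ax.plot(x, np.sin(freq * x))
        ax.plot(x, np.cos(freq * x))
        ax.set_title(f"frequency = {freq}")
        figures.append(fig)
    return figures


figures = run_iterations([1, 2])
```

Each iteration emits text and then a figure, so the expected order in a report is text, chart, text, chart.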

@piccolbo term=True made it even worse - got some name error and code did not even execute properly, while term=False works without problems.

mpastell commented 6 years ago

It's not matplotlib, but the current implementation in Pweave: https://github.com/mpastell/Pweave/blob/eddee3b6f9024432dff85fc22e834c926a5adab1/pweave/formatters/base.py#L191

dsanalytics commented 6 years ago

@mpastell - you mentioned 'will force the output to appear after each statement' - and that's exactly what I want and that is exactly what Atom does. However, pweave seems to be postponing all matplotlib output till the end of html generation.

So it does appear that pweave is not forcing output to appear in the matplotlib case - is that the issue with format_code_chunks(...)?

P.S. If important to know, the code that executes all this processing is in a separate python file and I'm just including it in pmd with source=... Unfortunately, I don't want to change that for my primary work is in Atom and pweave is just for reporting.

dsanalytics commented 6 years ago

@mpastell - to be even more precise about the symptoms: the html output is not mixed between chunks; rather, the output within each chunk is reordered when matplotlib is used.

E.g. 3 chunks in a pmd:

1) header (outputs text-h)
2) middle (python file referenced with source= outputs text1, chart1, text2, chart2)
3) footer (outputs text-f)

generated html looks like this: text-h text1 text2 chart1 chart2 text-f

while it should be looking like this: text-h text1 chart1 text2 chart2 text-f
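The difference between the two orderings can be shown with a small toy model. This is not Pweave's formatter code; render_grouped and render_in_order are hypothetical names used only to contrast "all text, then all charts" with "outputs kept in emission order" for the middle chunk.

```python
def render_grouped(outputs):
    """Toy model of the reported behavior: all text first, then all charts."""
    text = [name for kind, name in outputs if kind == "text"]
    charts = [name for kind, name in outputs if kind == "chart"]
    return text + charts


def render_in_order(outputs):
    """Toy model of the desired behavior: keep outputs in emission order."""
    return [name for _, name in outputs]


# Outputs of the middle chunk, in the order the code produces them.
middle = [
    ("text", "text1"),
    ("chart", "chart1"),
    ("text", "text2"),
    ("chart", "chart2"),
]

# Reported html: text-h text1 text2 chart1 chart2 text-f
print(" ".join(["text-h"] + render_grouped(middle) + ["text-f"]))
# Desired html:  text-h text1 chart1 text2 chart2 text-f
print(" ".join(["text-h"] + render_in_order(middle) + ["text-f"]))
```

In both cases the header and footer chunks stay in place; only the order inside the middle chunk changes.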

mpastell commented 6 years ago

Yes, it's clear what happens as that's how it is currently implemented. You need to change the formatter in order to fix that.

dsanalytics commented 6 years ago

@mpastell so, what would the time/effort estimate be, assuming the work is done by someone already familiar with the code (which I'm not)? I'm hoping that some other experienced contributor has some free time to fix this.