TiesdeKok / ipystata

Enables the use of Stata together with Python via Jupyter (IPython) notebooks.

Lines of output cut off #22

Closed mjfrigaard closed 7 years ago

mjfrigaard commented 7 years ago

Hello and thank you very much for ipystata. The integration is beautiful!

I keep running into issues with some of the output. In the Python notebook I've attached, you'll see that some of the results from the list commands are unclear/jumbled. I've also had issues with the output from some regressions being cut off on the right-hand side.

I am just wondering if there is an easy fix for this?

Thank you for your time!

image

pacbard commented 7 years ago

I did some digging into this. If you run for example:

%%stata
sysuse auto
list in 1/10

you will see that some lines in the output get eliminated like in the picture above.

If you look at the output log, you will notice that all the lines that have a number followed by a period get eliminated. For our example, this is the raw log:

. list in 1/10

     +----------------------------------------------------------------------------------+
  1. | make          |  price | mpg | rep78 | headroom | trunk | weight | length | turn |
     | AMC Concord   |  4,099 |  22 |     3 |      2.5 |    11 |  2,930 |    186 |   40 |
     |----------------------------------------------------------------------------------|
     |         displa~t         |         gear_r~o          |          foreign          |
     |              121         |             3.58          |         Domestic          |
     +----------------------------------------------------------------------------------+
...

and this is the output in ipystata:

 +----------------------------------------------------------------------------------+     | AMC Concord   |  4,099 |  22 |     3 |      2.5 |    11 |  2,930 |    186 |   40 |
     |----------------------------------------------------------------------------------|
     |         displa~t         |         gear_r~o          |          foreign          |
     |              121         |             3.58          |         Domestic          |
     +----------------------------------------------------------------------------------+
...

The regex on Line #80 seems to be responsible for replacing that line in the log with an empty line (see here for an explanation of the regex in question). This results in the behavior reported above.

The easy fix would be to remove the regex responsible for this behavior from the log processing function. The right fix would be to understand what that regex is supposed to eliminate and find a workaround for the case that you bring up.
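To illustrate the mechanism, here is a minimal Python sketch. The pattern `NUMBERED_LINE` is a hypothetical stand-in, not the actual regex from ipystata's source; it only assumes that the parser blanks out any log line that starts with a number followed by a period. The sample log is a shortened version of the raw log above.

```python
import re

# Hypothetical stand-in for the pattern in the log parser (an assumption,
# not the real regex): any line that starts with optional indentation,
# digits, a period, and a space is replaced with an empty line.
NUMBERED_LINE = re.compile(r"^[ \t]*\d+\.[ \t].*$", re.MULTILINE)

# Shortened version of the raw Stata log for `list in 1/10`.
raw_log = "\n".join([
    "     +---------------------------+",
    "  1. | make          |  price |",
    "     | AMC Concord   |  4,099 |",
    "     +---------------------------+",
])

cleaned = NUMBERED_LINE.sub("", raw_log)
print(cleaned)
```

Under this assumption, the row carrying the observation number disappears from the cleaned log, while the unnumbered rows of the same table survive.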

TiesdeKok commented 7 years ago

Hi,

First of all, thanks @pacbard for already looking into this. Very helpful!

I think there are two separate problems here:

1) The problem of @mjfrigaard seems to be standard behavior when the output is too wide. Stata does the same thing: if the output is wider than the output area, it forces line breaks to make it fit. If you "zoom out" (for example with Ctrl + mouse wheel), that should fix the jumbled output. Another solution is to prevent the output from becoming too wide in the first place.

2) I think you, @pacbard, stumbled on another issue, which indeed relates to the way that I parse the log files. In all honesty, the regex for parsing a log file is a bit of a mess currently. I mostly developed it by trial and error, testing it on the different outputs I often use and obviously missing the types of output that I do not normally use. I believe that I added that line to deal with loops, where the input code is converted into a weird plain-text representation with dots and numbers.

I will put it on my to-do list to look into the log file regex to make it support the list command.
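One possible direction, sketched here as an assumption rather than as the eventual fix: Stata echoes loop bodies in the log as a line number, a period, and then the code, whereas rows printed by list have a `|` right after the observation number. A narrower pattern could drop only the former. The name `LOOP_ECHO`, the helper `strip_loop_echo`, and the sample log are all made up for illustration, and other commands may produce numbered lines that this does not handle.

```python
import re

# Assumed pattern: a numbered line counts as a loop echo unless the number
# is followed by a "|" (the left border of a `list` table row).
LOOP_ECHO = re.compile(r"[ \t]*\d+\.(?![ \t]*\|)([ \t].*)?$")

def strip_loop_echo(log: str) -> str:
    """Drop echoed loop lines from a log but keep numbered list rows."""
    kept = [line for line in log.splitlines() if not LOOP_ECHO.match(line)]
    return "\n".join(kept)

# Illustrative log: a loop echo followed by a small list table.
sample_log = "\n".join([
    ". forvalues i = 1/2 {",
    "  2.     display `i'",
    "  3. }",
    "1",
    "2",
    ". list make price in 1/2",
    "     +-----------------------+",
    "     | make          price |",
    "     |-----------------------|",
    "  1. | AMC Concord   4,099 |",
    "  2. | AMC Pacer     4,749 |",
    "     +-----------------------+",
])

cleaned_log = strip_loop_echo(sample_log)
print(cleaned_log)
```

Filtering line by line avoids leaving blank lines behind, and the negative lookahead is the only part that distinguishes the two cases, so it is easy to adjust if other table styles turn up.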