Netflix / flamescope

FlameScope is a visualization tool for exploring different time ranges as Flame Graphs.
Apache License 2.0
3.02k stars 169 forks source link

Remove invalid chars from the stack file in medium.com blog (was - README example). #44

Closed thatsafunnyname closed 6 years ago

thatsafunnyname commented 6 years ago

Hello and thanks for flamescope, In https://medium.com/netflix-techblog/netflix-flamescope-a57ca19d47bb and https://github.com/Netflix/flamescope/blob/master/README.md#installation--instructions the example filename used is stacks.myproductionapp.2018-03-30_01 , (UPDATE - the first dash on the medium site is a non ASCII char), as of https://github.com/Netflix/flamescope/commit/cbd13e187d913cc522c512ed4725a4bd79209078 for https://github.com/Netflix/flamescope/issues/7 this is now invalid and the heatmap generation fails with a 500 from app/util/heatmap.py::read_offsets when fileutil.validpath returns false. I was thinking that in addition to a change to the filename used in the example (UPDATE - on medium.com), that an error could be printed to the terminal, something like:

     if not fileutil.validpath(path):
+        print("ERROR: File %s path %s is not valid" % (filename, path) )
         return abort(500)

Thanks.

Reproducer:

> cp examples/perf.test01 examples/stacks.myproductionapp.2018-03-30_01 (copy-and-paste from blog on medium.com)
> python run.py
Click on link for the heatmap.

Error in terminal is:

[16/Apr/2018 07:38:09] "GET /heatmap/?filename=stacks.myproductionapp.2018%E2%80%9303%E2%80%9330_01&rows=50 HTTP/1.1" 500 -

Error in the web app is a missing heatmap.

thatsafunnyname commented 6 years ago

Just wanted to point out that I think the reason the invalidchar regexp is failing is because the dash/hyphen "-" is a unicode codepoint while the regexp is not. If I prefix the regexp char group with a u then the dash/hyphen is allowed:

import re

invalidchars = re.compile('[^a-zA-Z0-9.,/_%+: -\\\\]') # does not allow unicode dash
#invalidchars = re.compile('u[^a-zA-Z0-9.,/_%+: -\\\\]') # allows unicode dash
pathname="201804" + u"\u2013" + "16"
#pathname="201804" + "-" + "16" # works
print pathname

ret = invalidchars.search(pathname)
if ret:
    print("ERROR: pathname '%s' is not valid: '%s' matched: '%s' type(pathname)='%s' " % (pathname, ret , ret.group(0) , type(pathname) ) ) 

UPDATE:

The source of the non ASCII dash was a copy-and-paste from the "Instructions" section of: https://medium.com/netflix-techblog/netflix-flamescope-a57ca19d47bb Note that the https://github.com/Netflix/flamescope/blob/master/README.md page does not contain a non ASCII dash:

medium.com

> echo stacks.myproductionapp.2018–03–30_01 | hexdump -c
0000000   s   t   a   c   k   s   .   m   y   p   r   o   d   u   c   t
0000010   i   o   n   a   p   p   .   2   0   1   8 342 200 223   0   3
0000020 342 200 223   3   0   _   0   1  \n
0000029

github.com

> echo stacks.myproductionapp.2018-03-30_01 | hexdump -c
0000000   s   t   a   c   k   s   .   m   y   p   r   o   d   u   c   t
0000010   i   o   n   a   p   p   .   2   0   1   8   -   0   3   -   3
0000020   0   _   0   1  \n
0000025
spiermar commented 6 years ago

So from your point of view, the issue is Medium containing unicode characters? Or that the tool should support unicode?

spiermar commented 6 years ago

This https://github.com/Netflix/flamescope/issues/18 issue is being used to track the exception handling part.

thatsafunnyname commented 6 years ago

The issue for me was the blog post on medium.com containing a non ASCII dash in the filename. If an error had been printed about an invalid filename then I would have tried a different (ASCII) filename earlier. I don't use unicode (deliberately) for filenames, so support for it is not an issue for me.

spiermar commented 6 years ago

Medium does that unfortunately, and not just for dashes. Once I get a bit of time, I'll work on issue #18

@brendangregg can you remove the dashes from the Medium example?

brendangregg commented 6 years ago

Fixed. Changed to underscores (Medium would not let me insert a dash there.)