Open ochameau opened 6 years ago
You have to run it through https://analysis.telemetry.mozilla.org, unfortunately. The repo is here, and you configure and run it from a Python notebook. If you'd like to go through with that, I can write up detailed instructions, but it's not a remarkably user-friendly process. However, the code you're interested in is here. It's relatively simple: just Python that, given the data for a particular hang, returns True or False. Note that the "stack" argument is unsymbolicated, so it will only be useful for filtering JS stacks. Symbolicating tends to explode the time taken by the job. I can get it to work, but I was planning to avoid doing so until someone actually requires it.
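To give a rough idea of what such a filter could look like, here is a minimal sketch. It is not the actual tracked.py code: the function name, its signature, and the assumption that the stack is a list of frame strings containing script URLs such as resource://devtools/ are all hypothetical.

```python
# Hypothetical sketch of a "tracked hang" predicate; the real tracked.py
# signature and stack format may differ.

# Assumed URL prefixes that identify DevTools JS frames.
DEVTOOLS_MARKERS = ("resource://devtools/", "chrome://devtools/")


def is_devtools_hang(stack):
    """Return True if any frame of an unsymbolicated stack mentions DevTools JS.

    `stack` is assumed to be a list of frame strings. Native frames are
    unsymbolicated, so only JS frames (which carry a script URL) can match.
    """
    for frame in stack:
        if not isinstance(frame, str):
            continue  # skip anything that is not a plain string frame
        if any(marker in frame for marker in DEVTOOLS_MARKERS):
            return True
    return False
```

Because it is plain substring matching on JS frames, a predicate like this needs no symbolication, which is why it should be fine for DevTools stacks.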
For now I only care about JS stacks. Most of DevTools is written in JavaScript. If you think I can run that locally and get results, yes, I'm ready to invest some time in setting this up. The tracked.py logic is trivial, but I'll need more help with the overall setup of this program.
Kk - I'll try to give a step-by-step:
repo_dir = "tmp"
repo_https_url = "https://github.com/ochameau/background-hang-reporter-job"
sc.defaultParallelism
!rm -rf $repo_dir
!git clone $repo_https_url $repo_dir && cd $repo_dir && python setup.py bdist_egg
import os
distpath = repo_dir + '/dist'
sc.addPyFile(os.path.join(distpath, os.listdir(distpath)[0]))
import background_hang_reporter_job
from datetime import datetime, timedelta
background_hang_reporter_job.etl_job_tracked_stats(sc, sqlContext, {
    'start_date': datetime.strptime("20170901", "%Y%m%d"),
    'end_date': datetime.today(),
    'sample_size': 0.01,  # change this to 1.0 when you're done; a low sample is just for dev
    'hang_profile_out_filename': 'historical_data_TEST',
    'exclude_modules': True,
})
Then make your changes to tracked.py in your clone. You'll need to commit, push, and close your notebook and restart it in order to pick up the changes, unfortunately.
Thanks for the detailed steps. I'll give that a try tomorrow.
I finally found some time to test this and it seems to work fine. Thanks a ton for your detailed steps!! I ran with a 0.01 sample and just pushed another run with 1.0. It took about 3 hours to complete; I've no idea how much time it will take for a full scan...
Yeah, it definitely takes some time. The scheduled job uses a cluster of 16 nodes for this reason. If you're running a 1.0 sample on a single node, it probably won't finish within the lifetime of your cluster. I would say a 0.01 sample should be sufficient for testing though, so feel free to submit the PR and I can get it into the scheduled job. (If you want to do more testing, you could try spinning up a larger cluster, or just set a more recent value for "start_date".)
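For the "more recent start_date" option, something along these lines might do. This is only a sketch: the helper name `recent_window_config` and its parameters are made up for illustration, while the dictionary keys are the ones from the notebook steps in this thread.

```python
from datetime import datetime, timedelta


def recent_window_config(days_back=14, sample_size=0.01, today=None):
    """Build a job config covering only the last `days_back` days.

    Hypothetical helper: only the dictionary keys come from the notebook
    steps; the name and parameters are assumptions for illustration.
    """
    end = today if today is not None else datetime.today()
    return {
        'start_date': end - timedelta(days=days_back),  # shorter window, less data to scan
        'end_date': end,
        'sample_size': sample_size,
        'hang_profile_out_filename': 'historical_data_TEST',
        'exclude_modules': True,
    }
```

The resulting dictionary would then be passed as the third argument to `background_hang_reporter_job.etl_job_tracked_stats(sc, sqlContext, ...)` in place of the literal one.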
I would like to start tracking some specific stacks related to DevTools. But I'm not sure they will all be worth tracking, and I may have many to track. I was wondering what the machinery behind the "tracked hangs" section is. Is this something I can set up locally on my machine?