Open motin opened 5 years ago
My preference here would be to include a flag that can disable these caps as they are less important for web crawling.
Do you have suggestions for these limits? The script/api combination cap is very liberal; it is mainly intended to prevent huge logs from scripts that call APIs in a never-ending loop (e.g. some scripts poll document.cookie in a setTimeout loop until a certain cookie shows up).
My preference here would be to include a flag that can disable these caps as they are less important for web crawling.
Sure. A cap value of -1 could indicate that the cap is disabled.
Do you have suggestions for these limits?
Not yet, and I am all for keeping the default caps very liberal to begin with. These caps are meant to protect the integrity of the data aggregation pipeline when crawlers / users visit problematic sites.
FYI, as a workaround in JESTr, we are adding a fail-safe that batches HTTP and JS packets by web navigation. Together with a client-enforced 500kb limit on telemetry pings, this lets us avoid the extreme volumes of packets that certain scripts would otherwise cause to be sent.
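A rough sketch of what such a fail-safe could look like, to make the idea concrete. This is not JESTr's actual code: the `navigation_id` field, the function name, and the choice to drop (and count) packets that would push a batch over the limit are all assumptions made for illustration, and "500kb" is read here as 500,000 bytes. The size check is approximate, since it sums the serialized size of each packet and ignores the overhead of the enclosing list.

```python
import json
from collections import defaultdict

# Assumption: "500kb" is interpreted as 500,000 bytes.
MAX_PING_BYTES = 500 * 1000


def batch_by_navigation(packets, max_bytes=MAX_PING_BYTES):
    """Group packets by navigation and keep each batch under max_bytes.

    Each packet is a dict with a hypothetical "navigation_id" key.
    Packets that would push a batch over the limit are dropped and counted.
    Returns (batches, dropped), both keyed by navigation id.
    """
    batches = defaultdict(list)
    sizes = defaultdict(int)
    dropped = defaultdict(int)
    for packet in packets:
        nav = packet["navigation_id"]
        # Approximate size: the UTF-8 length of the packet's JSON form.
        size = len(json.dumps(packet).encode("utf-8"))
        if sizes[nav] + size > max_bytes:
            dropped[nav] += 1
            continue
        batches[nav].append(packet)
        sizes[nav] += size
    return dict(batches), dict(dropped)
```

Keeping a per-navigation count of dropped packets means the analysis side can at least tell that a batch was truncated.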
I'm submitting a ...
[ ] bug report
[X] feature request
[ ] question about the decisions made in the repository
[ ] question about how to use this project
Summary
While we already have a cap on the number of calls logged for each script/API combination, we should ideally also have a cap on the number of calls logged for each script, frame and tab.
A script-level cap protects the data from exploding in size when scripts make frequent calls to many different APIs (such as Bokeh, which when used in notebooks can easily yield around 10k JS log calls per script).
A frame-level cap protects the data from exploding in size when frames load an unusually large number of scripts that each make frequent calls to instrumented APIs.
A tab-level cap protects the data from exploding in size when tabs include a large number of frames that load an unusually large number of scripts that each make frequent calls to instrumented APIs.
Ideally, an event should be emitted when a cap is reached, so that analysis can take the truncation into account.
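To illustrate how the four levels could fit together, here is a minimal sketch, not the existing instrumentation code. The class name, the level names, and the callback are hypothetical. It assumes that a cap of -1 (or a missing cap) disables that level, as suggested in the discussion, and that the "cap reached" event fires once per key, at the moment the first call is rejected.

```python
from collections import Counter

DISABLED = -1  # assumption from the thread: -1 disables a cap


class CallLogCaps:
    """Counts logged calls per tab, frame, script and script/API pair."""

    def __init__(self, caps, on_cap_reached=None):
        # caps maps "tab", "frame", "script", "script_api" to an int limit.
        self.caps = caps
        self.counts = Counter()
        self.reported = set()
        self.on_cap_reached = on_cap_reached or (lambda level, key: None)

    def should_log(self, tab, frame, script, api):
        """Return True if the call may be logged, and count it if so."""
        keys = {
            "tab": (tab,),
            "frame": (tab, frame),
            "script": (tab, frame, script),
            "script_api": (tab, frame, script, api),
        }
        for level, key in keys.items():
            cap = self.caps.get(level, DISABLED)
            if cap != DISABLED and self.counts[(level, key)] >= cap:
                # Emit the event only once per capped key.
                if (level, key) not in self.reported:
                    self.reported.add((level, key))
                    self.on_cap_reached(level, key)
                return False
        # Only calls that are actually logged count toward the caps.
        for level, key in keys.items():
            self.counts[(level, key)] += 1
        return True
```

For example, with `CallLogCaps({"script_api": 2, "script": 3})`, a script polling a single API is cut off after two logged calls, and the same script is cut off entirely after three logged calls across all APIs, while the frame and tab levels stay uncapped.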
Example of a notebook that would benefit from script-level capping: https://metrics.mozilla.com/~sbird/overscripted-clustering/clusters/working_assumption_and_refinement/precision%20bundling%20-%20part%201.html
Example of a page that would benefit from tab/frame-level capping: https://www.independent.co.uk