cyrus-and / chrome-har-capturer

Capture HAR files from a Chrome instance
MIT License
535 stars 90 forks source link

Support running in Docker, add postData and retries #54

Closed georg-malahov closed 6 years ago

georg-malahov commented 6 years ago

TODO:

cyrus-and commented 6 years ago

Hi and thanks for this, I understand that this probably comes from your current use case but there are several things that don't work for me. I cannot merge this PR as it is.

First and foremost I'm not sure I want to maintain the Docker (+ Compose) setup in this repo. There is a lot of boilerplate and duplicate code in the Docker setup here. And I don't agree on always using the trunk version of Chrome; this is a user choice IMHO.

The ESLint fix breaks the code.

The retry feature may be useful, but I don't like the implementation and printing to stderr from the library is not right. Moreover, you're providing default values for the CLI but not for the library itself.

Removing window.open is an arbitrary choice that can be implemented in a preHook if really needed. Also, I think that the CDP provides a better solution with Page.setDownloadBehavior.

As you say POST data needs to be improved; there is the hasPostData and the maxPostDataSize parameters to take into consideration and the MIME type is hard-coded to an unlikely value.

georg-malahov commented 6 years ago

Hi! First of all, thank you for this great package and a very detailed response to my PR. I absolutely don't mind if you skip and close this PR in its current stage.

I think that the most valuable here is the idea of retries, I will really appreciate if you you add this feature. POST data is also something the community may need IMHO.

You are absolutely right that it all comes from my current use case, I just wanted it to be shared in case someone else may need it.

The biggest problem that I faced while analysing about a thousand URLs in parallel of 5 or more is that chrome sometimes crash unexpectedly. That is where the idea to run chrome in Docker container with restarts on-failure and chrome-har-capturer's retries came from. Another problem was that not all available Docker images of chrome could be run without --privileged flag and therefore used more widely. I would really appreciate if can point to another more suitable Docker image with chrome that is more suitable for CI infrastructure.

I will definitely star this project and contribute more someday. Thanks again for your great work!

cyrus-and commented 6 years ago

Sorry for the late reply and thanks.

I think that the most valuable here is the idea of retries, I will really appreciate if you you add this feature. POST data is also something the community may need IMHO.

I finally added both features, I will wait to bump a new release because I wanted to also include another pending feature.

Another problem was that not all available Docker images of chrome could be run without --privileged flag and therefore used more widely. I would really appreciate if can point to another more suitable Docker image with chrome that is more suitable for CI infrastructure.

AFAIK starting from, say debian, and installing Chrome from the official .deb should be basically enough.