Hardeepex / webscraper


Sweep: i want to use docker for selenium instead of java #13

Closed. Hardeepex closed this issue 10 months ago

Hardeepex commented 10 months ago
Checklist

- [X] Create `Dockerfile` ✓ https://github.com/Hardeepex/webscraper/commit/b6c93296c1905385e9989cbf4bf73799ea639deb
- [X] Running GitHub Actions for `Dockerfile` ✓
- [X] Modify `src/selenium_grid.py` ✓ https://github.com/Hardeepex/webscraper/commit/8ef2dc8c8d1b2bf75384412e67db313ba0a38205
- [X] Running GitHub Actions for `src/selenium_grid.py` ✓
- [X] Modify `README.md` ✓ https://github.com/Hardeepex/webscraper/commit/109949332a78460a5f2742f4b2dbf604594d2c0b
- [X] Running GitHub Actions for `README.md` ✓

sweep-ai[bot] commented 10 months ago

🚀 Here's the PR! #14


Sandbox Execution ✓

Here are the sandbox execution logs prior to making any changes:

Sandbox logs for 124e659
Checking src/selenium_grid.py for syntax errors...
✅ src/selenium_grid.py has no syntax errors!

Sandbox passed on the latest main, so sandbox checks will be enabled for this issue.


Step 1: 🔎 Searching

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Some code snippets I think are relevant, in decreasing order of relevance. If a file is missing from here, you can mention its path in the ticket description.

- https://github.com/Hardeepex/webscraper/blob/124e659f7be6d32b0d04d086bf40178dd65135c9/src/selenium_grid.py#L2-L13
- https://github.com/Hardeepex/webscraper/blob/124e659f7be6d32b0d04d086bf40178dd65135c9/src/singleproduct.py#L14-L31
- https://github.com/Hardeepex/webscraper/blob/124e659f7be6d32b0d04d086bf40178dd65135c9/src/rawyhtmlscraper.py#L2-L7

Step 2: ⌨️ Coding

Start the Selenium Grid

CMD ["start-selenium-grid.sh"]


</blockquote>

- [X] Running GitHub Actions for `Dockerfile` ✓
<blockquote>Check Dockerfile with contents:

Ran GitHub Actions for <a href="https://github.com/Hardeepex/webscraper/commit/b6c93296c1905385e9989cbf4bf73799ea639deb">b6c93296c1905385e9989cbf4bf73799ea639deb</a>:

</blockquote>

- [X] Modify `src/selenium_grid.py` ✓ https://github.com/Hardeepex/webscraper/commit/8ef2dc8c8d1b2bf75384412e67db313ba0a38205
<blockquote>Modify src/selenium_grid.py with contents:<br/>• Replace the `setup_selenium_grid()` function with a function that starts the Selenium Grid using Docker. This function should use the `subprocess` module to run the necessary Docker commands to start the Selenium Grid.<br/>• The new `setup_selenium_grid()` function should look something like this:

```python
def setup_selenium_grid():
    subprocess.Popen("docker build -t selenium-grid .", shell=True)
    subprocess.Popen("docker run -d -p 4444:4444 --name selenium-grid selenium-grid", shell=True)
```

--- 
+++ 
@@ -5,8 +5,8 @@

 def setup_selenium_grid():
-    subprocess.Popen("java -jar selenium-server-standalone.jar -role hub", shell=True)
-    subprocess.Popen("java -jar selenium-server-standalone.jar -role node  -hub http://localhost:4444/grid/register", shell=True)
+    subprocess.Popen("docker build -t selenium-grid .", shell=True)
+    subprocess.Popen("docker run -d -p 4444:4444 --name selenium-grid selenium-grid", shell=True)

 def get_webdriver():
     desired_cap = DesiredCapabilities.CHROME
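
One thing worth noting about the change above: `subprocess.Popen` returns as soon as the process is spawned, so `docker run` may start before `docker build` has finished. A minimal sketch of a sequential variant follows, assuming the same image name, container name, and port as in this PR. The helper `grid_commands` and the injectable `run` parameter are hypothetical names added for illustration and are not part of the repository.

```python
import subprocess
from typing import Callable, List

# Assumed defaults, mirroring the names and port used in this PR.
IMAGE = "selenium-grid"
CONTAINER = "selenium-grid"
PORT = 4444


def grid_commands(image: str = IMAGE, container: str = CONTAINER,
                  port: int = PORT) -> List[List[str]]:
    """Return the two docker commands as argument lists (no shell required)."""
    return [
        ["docker", "build", "-t", image, "."],
        ["docker", "run", "-d", "-p", f"{port}:{port}", "--name", container, image],
    ]


def setup_selenium_grid(run: Callable[..., object] = subprocess.run) -> None:
    """Build the image, then start the container, stopping at the first failure."""
    for cmd in grid_commands():
        # subprocess.run blocks until the command exits; check=True raises
        # CalledProcessError on a non-zero exit code.
        run(cmd, check=True)
```

Calling `setup_selenium_grid()` runs both commands in order. Note that `docker run --name selenium-grid` fails if a container with that name already exists, so a stale container would need to be removed first.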

Ran GitHub Actions for 8ef2dc8c8d1b2bf75384412e67db313ba0a38205:

  1. Build the Docker image for Selenium Grid:

    docker build -t selenium-grid .
  2. Run the Docker image for Selenium Grid:

    docker run -d -p 4444:4444 --name selenium-grid selenium-grid
--- 
+++ 
@@ -1,6 +1,18 @@
 # webscraper

 # Instructions
+
+## Using Docker for Selenium Grid
+
+1. Build the Docker image for Selenium Grid:
+   ```
+   docker build -t selenium-grid .
+   ```
+
+2. Run the Docker image for Selenium Grid:
+   ```
+   docker run -d -p 4444:4444 --name selenium-grid selenium-grid
+   ```
 # To install the dependencies using pip and the requirements.txt file, run the following command:
 # pip install -r requirements.txt
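
After the `docker run` step, the container may need a few seconds before it accepts sessions. The sketch below polls for readiness before a scraper connects. It assumes the image serves Selenium Grid 4's `/status` endpoint on port 4444; the contents of the `Dockerfile` are not shown in this thread, so the URL is an assumption (a Selenium 3 hub would use `/wd/hub/status` instead). The function names are hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable

# Assumption: Selenium Grid 4 status endpoint on the port published above.
STATUS_URL = "http://localhost:4444/status"


def fetch_status(url: str = STATUS_URL) -> dict:
    """Fetch and decode the grid's JSON status document."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_for_grid(fetch: Callable[[], dict] = fetch_status,
                  attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll until the status reports value.ready, or give up after `attempts` tries."""
    for _ in range(attempts):
        try:
            if fetch().get("value", {}).get("ready"):
                return True
        except (OSError, ValueError):
            # Connection refused or a non-JSON reply: the grid is not up yet.
            pass
        time.sleep(delay)
    return False
```

A caller could run `wait_for_grid()` right after starting the container and only create the remote WebDriver once it returns `True`.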

Ran GitHub Actions for 109949332a78460a5f2742f4b2dbf604594d2c0b:


Step 3: 🔁 Code Review

I have finished reviewing the code for completeness. I did not find errors for sweep/i_want_to_use_docker_for_selenium_instea.



💡 To recreate the pull request, edit the issue title or description. To tweak the pull request, leave a comment on the pull request.