webrecorder / browsertrix

Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!
https://webrecorder.net/browsertrix
GNU Affero General Public License v3.0

[Bug]: FastAPI crawlconfig API - 405 Method Not Allowed #1595

Closed tkrn closed 8 months ago

tkrn commented 8 months ago

Browsertrix Version

v1.9.3-79a217b

What did you expect to happen? What happened instead?

When posting directly to the API through a terminal interface, I expected the crawlconfig to be created. Instead, the API responds with "Method Not Allowed".

tkrn@archive ~/tkrnctl $ http POST http://localhost:30870/api/orgs/37e3230e-5558-45a6-baea-ea3b631aa02f/crawlconfigs "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiI1ZDBkMmFiMC03ZjE5LTQ5Y2QtYWEyZi1hZDZhNmZhNGU3ZjciLCJhdWQiOiJidHJpeDphdXRoIiwiZXhwIjoxNzE1NTQxMjQxfQ.MEsknlJqOqjnjwvHeAqwMlmcdoamTO-Tf-18FAumagk" < /tmp/tkrnctl_vHlaUUaqGI/c6b139f4cb.json
HTTP/1.1 405 Method Not Allowed
Connection: keep-alive
Content-Length: 31
Content-Type: application/json
Date: Wed, 13 Mar 2024 20:06:46 GMT
Server: nginx/1.23.2
allow: GET

{
    "detail": "Method Not Allowed"
}

Step-by-step reproduction instructions

Using the Firefox developer tools, I captured the API POST that the web interface sends to create a crawlconfig, along with its JSON payload. I am trying to re-create this from the terminal with httpie (pip package). After logging in, I get a valid authorization token, but with that same token I cannot POST the same JSON payload that the web interface just sent. See Additional details for the payload.

Additional details

cat /tmp/tkrnctl_vHlaUUaqGI/c6b139f4cb.json

{
  "jobType": "seed-crawl",
  "name": "www.motherboard.cz",
  "description": null,
  "scale": 1,
  "profileid": "",
  "runNow": false,
  "schedule": "",
  "crawlTimeout": 0,
  "maxCrawlSize": 0,
  "tags": [
    "batch 1"
  ],
  "autoAddCollections": [],
  "config": {
    "seeds": [
      {
        "url": "http://www.motherboard.cz",
        "scopeType": "domain",
        "include": [],
        "extraHops": 1,
        "depth": 25
      }
    ],
    "scopeType": "domain",
    "useSitemap": true,
    "failOnFailedSeed": false,
    "behaviorTimeout": null,
    "pageLoadTimeout": 180,
    "pageExtraDelay": 5,
    "limit": null,
    "lang": "en",
    "exclude": [],
    "behaviors": "autoscroll,autoplay,autofetch,siteSpecific"
  }
}

ikreymer commented 8 months ago

I think you may be missing the trailing slash, it should be .../crawlconfigs/...
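
A minimal sketch of the trailing-slash fix, assuming the endpoint path shown in the report (`/api/orgs/<org id>/crawlconfigs`) and a bearer token obtained from a prior login. The helper names `with_trailing_slash` and `build_crawlconfig_request` are hypothetical and are not part of Browsertrix; the sketch only builds the request, it does not send it.

```python
import json
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url: str) -> str:
    """Return url with a trailing slash on its path; query and fragment are kept."""
    parts = urlsplit(url)
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def build_crawlconfig_request(base_url: str, org_id: str, token: str, payload: dict):
    """Build (but do not send) a POST to the crawlconfigs collection endpoint.

    Hypothetical helper: the path layout is copied from the report above,
    and the trailing slash is the fix suggested in this thread.
    """
    url = with_trailing_slash(f"{base_url}/api/orgs/{org_id}/crawlconfigs")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the resulting request to `urllib.request.urlopen` would send it. With httpie, the equivalent change is simply to end the URL in the original command with `/crawlconfigs/`.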

tkrn commented 8 months ago

Case closed. Thank you and I feel like a goof!