kiliman / tailwindui-crawler

tailwindui-crawler downloads the component HTML files locally
MIT License

419 error on login #64

Closed vanenshi closed 2 years ago

vanenshi commented 2 years ago

Hey guys, I'm getting a 419 error in the response from the login. Any idea how to fix that?

kiliman commented 2 years ago

Interesting... Looks like they've changed the login process a bit so my crawler isn't able to complete the login.

lampenmeister commented 2 years ago

Same problem...

🔐 Logging into tailwindui.com...
⏱ 303ms (200) https://tailwindui.com/login
🚫 Invalid credentials

vanenshi commented 2 years ago

@kiliman Thank you for this great library! Let me know if I can help you out.

kiliman commented 2 years ago

Thanks... sorry for the delay. I worked on it a little and am having some issues getting the fetch to send the correct info. They've changed from an embedded token to using cookies for XSRF protection. I'm pretty sure I'm reading and sending the correct cookies, but I'm still getting the error. When I send the request directly (curl-like), it works fine.

Anyway, hopefully I figure it out today.
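For context, a 419 status is Laravel's "Page Expired" CSRF error, which fits the cookie-based XSRF theory. A minimal sketch of that handshake — the helper names here are illustrative, not the crawler's actual code, and the Laravel-style cookie convention is an assumption:

```javascript
// Hypothetical helpers for a Laravel-style XSRF login handshake.
// Assumption: the login page sets an XSRF-TOKEN cookie that must be
// echoed back (URL-decoded) in an X-XSRF-TOKEN request header.

// Collapse raw Set-Cookie values into a Cookie request header.
function cookieHeader(setCookies) {
  return setCookies.map((c) => c.split(';')[0]).join('; ')
}

// Extract and decode the XSRF-TOKEN value for the X-XSRF-TOKEN header.
function xsrfToken(setCookies) {
  const pair = setCookies
    .map((c) => c.split(';')[0])
    .find((c) => c.startsWith('XSRF-TOKEN='))
  return pair ? decodeURIComponent(pair.slice('XSRF-TOKEN='.length)) : undefined
}

// Usage sketch: GET /login first, then POST the credentials with
// { Cookie: cookieHeader(cookies), 'X-XSRF-TOKEN': xsrfToken(cookies) }.
```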

kiliman commented 2 years ago

I got past the login, but it's not loading the component pages properly (acting as if I'm not logged in). I wonder if they are actively trying to prevent the crawler from working.

They've definitely made some changes that are making it hard to do.

kiliman commented 2 years ago

I'm stumped. If I copy the cookies from Chrome Dev Tools and fetch, it works fine. But if I get the cookies from the set-cookie headers, it fails. Same with cURL: if I copy the cookies manually, it works, but if I use the cookies from the fetch, it fails.

The cookies look the same (same format, different values), but it's not working :(

akoenig commented 2 years ago

@kiliman, tailwind-crawler is awesome, and it's sad that the changed login now prevents downloading the components. We're using shuffle.dev for prototyping, and it relies on tailwind-crawler when it comes to using Tailwind UI components. Are you aware of an alternative for downloading the components?

kiliman commented 2 years ago

I figured out the issue but am still working on a solution. I don't think it was a deliberate block, because I'm still able to download if I manually set the required cookies.

akoenig commented 2 years ago

@kiliman Good to hear! Thanks for getting back! Can you elaborate a little bit on the workaround?

kiliman commented 2 years ago

The problem is that node fetch isn't getting the entire session cookie value. It only returns about 1070 characters, but the actual cookie is around 1400 characters. If I use curl or the browser, it gets the full cookie. When I copy+paste that cookie into the crawler, it works fine.

So I need to figure out how to get the full cookie value. I've tried multiple node packages and they all have the same issue, so I believe it's a Node platform problem.
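One plausible failure mode (an assumption on my part, not confirmed in the thread): HTTP clients that fold repeated Set-Cookie headers into one comma-joined string can mangle the values, because cookie attributes like Expires contain commas themselves. A quick illustration:

```javascript
// Illustration (not the crawler's code) of why folded Set-Cookie headers
// are hazardous: node-fetch v2's headers.get('set-cookie') joins repeated
// headers with ', ', and any later split on ', ' tears Expires dates apart.
const folded =
  'XSRF-TOKEN=abc; expires=Tue, 02 Aug 2022 12:00:00 GMT; path=/, ' +
  'tailwind_ui_session=xyz; expires=Tue, 02 Aug 2022 12:00:00 GMT; path=/; httponly'

// Two cookies in, four fragments out -- a long session value can come out short.
const fragments = folded.split(', ')

// node-fetch v2 also exposes the unfolded list, which keeps each value intact:
//   const cookies = res.headers.raw()['set-cookie'] // -> string[]
```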

akoenig commented 2 years ago

Ah okay, understood. Thanks, @kiliman. So I managed to download all components by ...

  1. ... performing a login in the browser (https://tailwindui.com)
  2. ... commenting this part (L240 - L245) out
  3. ... copying the tailwind_ui_session cookie via the browser dev tools
  4. ... adding the Cookie header via options here
  5. ... running yarn start afterwards

🙂
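In code, the patch in step 4 boils down to sending the pasted cookie on every request. A hedged sketch — the helper is hypothetical, only the tailwind_ui_session cookie name comes from the steps above:

```javascript
// Hypothetical helper: build request headers from a session cookie value
// pasted out of the browser's dev tools (step 3 above).
function pastedCookieHeaders(sessionValue) {
  return { Cookie: `tailwind_ui_session=${sessionValue}` }
}

// Usage sketch with a fetch-style client:
//   const res = await fetch(url, { headers: pastedCookieHeaders(SESSION) })
```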

kiliman commented 2 years ago

Yup, pretty much. Not sure why node can't read the full cookie. I may need to shell out to curl or something else to fetch the cookies.

jverghese commented 2 years ago

@kiliman Thanks for creating this repo. I tried embedding the cookie and it says Found 0 components on every page. Is the markup extraction still working?

cd-slash commented 2 years ago

I'm having the same issue after pasting the cookie - 0 components found. I'm also using the crawler to try to import components into shuffle.dev per their recommendation.

kiliman commented 2 years ago

Finally got this problem fixed. They made a lot of changes to the site. In addition to the login issue, they changed how the components are structured.

I also added the ability to download the Page Templates. See the README for details.

Thanks all for your patience.