0xdevalias / chatgpt-source-watch

Analyzing the evolution of ChatGPT's codebase through time with curated archives and scripts
https://github.com/0xdevalias/chatgpt-source-watch/blob/main/CHANGELOG.md

Fix script for extracting CSS URLs from `webpack.js` + unpacking `*.css` files #6

Open 0xdevalias opened 9 months ago

0xdevalias commented 9 months ago

In the past, only a single `*.css` URL was extracted from `webpack.js` (from the `miniCssF` field), so it was unpacked as `miniCssF.css`. A fixed name was needed because the hash in each `*.css` filename changes every time the file is rebuilt, and the downloaded filenames don't seem to include a static chunk part.

More recently, there have been new `*.css` files specific to certain chunks (sometimes shared among multiple chunks). The extraction scripts don't handle this, and produce a broken entry like this:

https://cdn.oaistatic.com/_next/undefined

We also need to think about how best to name the files. I think `main.css` would probably work for the 'main' chunk (what we previously called `miniCssF.css`). For the `*.css` files related to the other chunks, if each one only applied to a single chunk I would probably have named it after that chunk, but sometimes one file is used by multiple chunks. If doing it manually, we could probably figure out what each file is used for and name it based on that, but I'm not sure of the best way to do this automatically. We can't use the hash of the `*.css` file, as that changes every time the file changes.
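One possible automatic scheme, sketched below, is to group the CSS files by hash and name each one after the sorted list of chunk IDs that reference it, with `main.css` reserved for the file used by the main chunk. This is only a sketch under assumptions: `nameCssFiles`, the `chunkIdToHash` input shape, the `mainChunkId` parameter, and the `chunk-<ids>.css` pattern are all hypothetical, and it only helps if the chunk IDs themselves stay stable between builds, which isn't guaranteed.

```javascript
// Hypothetical helper: derive file names that don't depend on the CSS hash.
// `chunkIdToHash` is assumed to be a plain object like { "2888": "aaa111" },
// i.e. the chunk ID -> CSS hash mapping read out of webpack.js.
function nameCssFiles(chunkIdToHash, mainChunkId = null) {
  // Group chunk IDs by the CSS hash they point at, since one CSS file
  // can be shared among multiple chunks.
  const chunksByHash = new Map();
  for (const [chunkId, hash] of Object.entries(chunkIdToHash)) {
    if (!chunksByHash.has(hash)) chunksByHash.set(hash, []);
    chunksByHash.get(hash).push(chunkId);
  }

  const names = {};
  for (const [hash, chunkIds] of chunksByHash) {
    if (mainChunkId !== null && chunkIds.includes(String(mainChunkId))) {
      // The file used by the 'main' chunk keeps a fixed, friendly name.
      names[hash] = "main.css";
    } else {
      // Sort so the name is deterministic regardless of input order.
      const sorted = [...chunkIds].sort();
      names[hash] = `chunk-${sorted.join("-")}.css`;
    }
  }
  return names;
}
```

With this scheme a file shared by chunks `3456` and `9999` would be unpacked as `chunk-3456-9999.css`, so its name only changes when the set of chunks using it changes, not when its contents do.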

0xdevalias commented 9 months ago

Fixed `extract-webpack-urls.js` to handle both single and multiple CSS chunk URLs in https://github.com/0xdevalias/chatgpt-source-watch/commit/1c127295d3aa2781b933a56d96bfb386380eb029
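For illustration only (this is not the code from that commit), handling both cases could look roughly like the sketch below. It assumes the minified runtime defines `miniCssF` either as a function returning a single literal `*.css` path, or as a function that builds the path from a chunk ID to hash map. The regexes, the `extractCssUrls` name, and the exact minified shapes are assumptions; only the `https://cdn.oaistatic.com/_next/` base comes from the broken URL in the issue.

```javascript
// Base URL taken from the broken entry in the issue; everything else here
// is a hypothetical sketch, not the actual extract-webpack-urls.js logic.
const BASE_URL = "https://cdn.oaistatic.com/_next/";

// Assumed multi-chunk shape:
//   x.miniCssF=function(e){return"static/css/"+{123:"abc",456:"def"}[e]+".css"}
const MULTI_CSS_RE =
  /miniCssF\s*=\s*function\s*\(\w*\)\s*\{\s*return\s*"([^"]*)"\s*\+\s*\(?\{([^}]*)\}\)?\s*\[\w+\]\s*\+\s*"([^"]*)"/;

// Assumed single-file shape:
//   x.miniCssF=function(e){return"static/css/abc.css"}
const SINGLE_CSS_RE =
  /miniCssF\s*=\s*function\s*\(\w*\)\s*\{\s*return\s*"([^"]+\.css)"/;

// One `chunkId:"hash"` entry inside the map literal (keys may be quoted).
const MAP_ENTRY_RE = /"?([\w-]+)"?\s*:\s*"([^"]+)"/g;

function extractCssUrls(webpackSource, baseUrl = BASE_URL) {
  const multi = webpackSource.match(MULTI_CSS_RE);
  if (multi) {
    const [, prefix, mapBody, suffix] = multi;
    // Emit one URL per chunk ID; shared files appear once per chunk.
    return [...mapBody.matchAll(MAP_ENTRY_RE)].map(([, chunkId, hash]) => ({
      chunkId,
      url: baseUrl + prefix + hash + suffix,
    }));
  }

  const single = webpackSource.match(SINGLE_CSS_RE);
  if (single) {
    // No chunk ID is available when there is only one CSS file.
    return [{ chunkId: null, url: baseUrl + single[1] }];
  }

  // Returning nothing avoids emitting an entry like `_next/undefined`.
  return [];
}
```

Keeping the chunk ID alongside each URL means the later unpacking step still knows which chunks a given `*.css` file belongs to.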

Still need to figure out the best way to rename the CSS chunks and update `unpack-files-from-orig.js` accordingly.