labordynamicsinstitute / ssc-mirror

Mirror of SSC

Get a list of all files on SSC #3

Open larsvilhuber opened 9 months ago

larsvilhuber commented 9 months ago

It is useful to have a full list of all Stata commands stored on SSC. Such a list can be derived from the list of files, and generating an up-to-date list of all files on the SSC mirror as part of building the mirror is easy enough.
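A minimal sketch of how such a file list could be generated, assuming the mirror is an ordinary directory tree on disk. The function names, the choice to skip `.git`, and the one-path-per-line output format are illustrative assumptions, not necessarily what the repository does.

```python
import os
from pathlib import Path


def list_mirror_files(root):
    """Return sorted POSIX-style paths of all files under root, relative to root."""
    root = Path(root)
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: version-control metadata is not part of the mirror content.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            paths.append((Path(dirpath) / name).relative_to(root).as_posix())
    return sorted(paths)


def write_file_list(root, out_path):
    """Write one relative path per line to out_path and return the file count."""
    files = list_mirror_files(root)
    Path(out_path).write_text("\n".join(files) + "\n", encoding="utf-8")
    return len(files)
```

Calling `write_file_list("path/to/mirror", "sscfiles.txt")` at the end of a mirror run would refresh the list; both paths here are placeholders.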

larsvilhuber commented 9 months ago

See 950b20bf1bf0222872a86108cae8043fb1c0b764

larsvilhuber commented 9 months ago

File is here: https://github.com/labordynamicsinstitute/ssc-mirror/blob/releases/sscfiles.txt. It is of limited use on its own, of course, because it doesn't map the files to packages. For that, we need to parse the pkg files.
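A rough sketch of what parsing the pkg files might look like. It assumes the usual Stata package-file layout, in which lines starting with `f` or `F` name a file belonging to the package; platform-specific lines and everything else are ignored here. The function names and the use of the `.pkg` file stem as the package name are assumptions for illustration.

```python
from pathlib import Path


def parse_pkg_text(text):
    """Return the file names listed on 'f'/'F' lines of a .pkg file's text."""
    files = []
    for line in text.splitlines():
        parts = line.split()
        # Assumption: 'f <filename>' and 'F <filename>' lines list package files.
        if len(parts) >= 2 and parts[0] in ("f", "F"):
            files.append(parts[1])
    return files


def map_files_to_packages(root):
    """Map each listed file name to the packages (by .pkg stem) that list it."""
    mapping = {}
    for pkg in sorted(Path(root).rglob("*.pkg")):
        # latin-1 decodes any byte sequence, so older files do not raise errors.
        text = pkg.read_text(encoding="latin-1")
        for fname in parse_pkg_text(text):
            mapping.setdefault(fname, []).append(pkg.stem)
    return mapping
```

File names are kept exactly as written in the pkg file (including any relative path prefix), so joining them against the file list would still need a normalization step.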

larsvilhuber commented 9 months ago

@sergiocorreia do you think you can adjust your Python mirror scripts to simply work on the existing downloaded mirror (as it exists when building the daily snapshots, or when you do git clone --depth 1 of this repo)?

It would seem simple enough to just crawl it without building the package-specific ZIP files?