kylebarron / medicare-documentation

Unified Medicare documentation in a single responsive website
https://kylebarron.dev/medicare-documentation/

Download and store all relevant ResDAC pages #6

Closed kylebarron closed 6 years ago

kylebarron commented 6 years ago

The first script should download the pages that are linked from the main data documentation pages, i.e.:

urls_dict = {
    'Beneficiary Summary File':
        'https://www.resdac.org/cms-data/files/mbsf/data-documentation',
    'Carrier RIF':
        'https://www.resdac.org/cms-data/files/carrier-rif/data-documentation',
    'Durable Medical Equipment RIF':
        'https://www.resdac.org/cms-data/files/dme-rif/data-documentation',
    'Home Health Agency RIF':
        'https://www.resdac.org/cms-data/files/hha-rif/data-documentation',
    'Hospice RIF':
        'https://www.resdac.org/cms-data/files/hospice-rif/data-documentation',
    'Inpatient RIF':
        'https://www.resdac.org/cms-data/files/ip-rif/data-documentation',
    'MedPAR RIF':
        'https://www.resdac.org/cms-data/files/medpar-rif/data-documentation',
    'Outpatient RIF':
        'https://www.resdac.org/cms-data/files/op-rif/data-documentation',
    'Skilled Nursing Facility RIF':
        'https://www.resdac.org/cms-data/files/snf-rif/data-documentation'
}
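
A minimal sketch of that first step, assuming only the standard library. The names `LinkCollector`, `extract_links`, `fetch`, and `linked_pages` are hypothetical, and the sketch collects every link on a page; which links actually point at variable pages would still need a filter that isn't specified here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        if href:
            # Resolve relative links against the page they were found on
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return the unique absolute URLs linked from one page, in page order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    # dict.fromkeys de-duplicates while keeping first-seen order
    return list(dict.fromkeys(parser.links))


def fetch(url):
    """Download one page and return its HTML as text (needs network access)."""
    with urlopen(url) as response:
        return response.read().decode('utf-8', errors='replace')


def linked_pages(urls_dict):
    """Map each file name in urls_dict to the URLs linked from its page."""
    return {
        name: extract_links(fetch(url), url)
        for name, url in urls_dict.items()
    }
```

Calling `linked_pages(urls_dict)` with the dict above would hit the network once per documentation page; `extract_links` can be tried offline on any saved HTML.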

Store those files in a local directory (probably git-ignored), and probably key each stored file by its markdown anchor address.

So for example:

medicare-documentation/
---- docs/
---- data/
---- ---- resdac/
---- ---- ---- rif-pages/
---- ---- ---- variables/
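
One way the layout above could be mapped to file paths, as a rough sketch. `slugify`, `local_path`, and `save_page` are hypothetical helpers; the slug rule (lowercase, punctuation dropped, whitespace turned into hyphens) only approximates how markdown anchors are usually generated, and the `.html` extension is an assumption.

```python
import re
from pathlib import Path

# Relative to the repository root, matching the tree above
DATA_DIR = Path('data') / 'resdac'


def slugify(title):
    """Approximate a markdown heading anchor for a page title."""
    slug = title.strip().lower()
    slug = re.sub(r'[^\w\s-]', '', slug)  # drop punctuation
    return re.sub(r'\s+', '-', slug)      # whitespace runs become hyphens


def local_path(kind, title):
    """Return the storage path; kind is 'rif-pages' or 'variables'."""
    return DATA_DIR / kind / f'{slugify(title)}.html'


def save_page(kind, title, html):
    """Write one downloaded page to its local path and return that path."""
    path = local_path(kind, title)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(html, encoding='utf-8')
    return path
```

Keeping the path computation in `local_path` separate from the write means the same function can later be used to look a page up by its anchor.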

Then also store a dataset recording which RIF pages reference which variables, along with the local path at which each page is stored.
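
A sketch of what that dataset could look like, assuming a flat CSV with one row per (RIF page, variable) reference. The column names and the `link_rows` and `write_links` helpers are hypothetical, as are the example variable name and paths in the usage below.

```python
import csv

FIELDS = ['rif_page', 'rif_path', 'variable', 'variable_path']


def link_rows(references, rif_paths, variable_paths):
    """Yield one row per (RIF page, variable) reference.

    references:     {rif page name: [variable names it links to]}
    rif_paths:      {rif page name: local path of the stored page}
    variable_paths: {variable name: local path of the stored page}
    """
    for rif_page, variables in references.items():
        for variable in variables:
            yield {
                'rif_page': rif_page,
                'rif_path': rif_paths[rif_page],
                'variable': variable,
                'variable_path': variable_paths[variable],
            }


def write_links(rows, out):
    """Write the rows as CSV to an open text file object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator='\n')
    writer.writeheader()
    writer.writerows(rows)
```

For example:

```python
import io

buffer = io.StringIO()
write_links(
    link_rows(
        {'Carrier RIF': ['var_a']},
        {'Carrier RIF': 'data/resdac/rif-pages/carrier-rif.html'},
        {'var_a': 'data/resdac/variables/var_a.html'},
    ),
    buffer,
)
```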

kylebarron commented 6 years ago

Resolved in https://github.com/kylebarron/medicare-documentation/commit/1c9d53af039f832b931b93821f2f33b78a70cec2