Closed kylebarron closed 6 years ago
First script should download pages that are linked to from the main data documentation pages, i.e.
urls_dict = {
    'Beneficiary Summary File': 'https://www.resdac.org/cms-data/files/mbsf/data-documentation',
    'Carrier RIF': 'https://www.resdac.org/cms-data/files/carrier-rif/data-documentation',
    'Durable Medical Equipment RIF': 'https://www.resdac.org/cms-data/files/dme-rif/data-documentation',
    'Home Health Agency RIF': 'https://www.resdac.org/cms-data/files/hha-rif/data-documentation',
    'Hospice RIF': 'https://www.resdac.org/cms-data/files/hospice-rif/data-documentation',
    'Inpatient RIF': 'https://www.resdac.org/cms-data/files/ip-rif/data-documentation',
    'MedPAR RIF': 'https://www.resdac.org/cms-data/files/medpar-rif/data-documentation',
    'Outpatient RIF': 'https://www.resdac.org/cms-data/files/op-rif/data-documentation',
    'Skilled Nursing Facility RIF': 'https://www.resdac.org/cms-data/files/snf-rif/data-documentation',
}
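A minimal sketch of the link-collection step, using only the standard library. It is not the script from the resolving commit. The names `LinkCollector`, `extract_links`, and `fetch` are hypothetical, and it assumes every `<a href>` on a documentation page is a candidate; the real pages may need a narrower filter.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_links(html, base_url):
    """Return absolute, de-duplicated link targets in page order."""
    parser = LinkCollector()
    parser.feed(html)
    seen, links = set(), []
    for href in parser.hrefs:
        # Resolve relative links and drop any #fragment so that
        # two links to the same page count once.
        url, _fragment = urldefrag(urljoin(base_url, href))
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


def fetch(url):
    """Download one page and return its text (assumes UTF-8)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")
```

Usage would be something like `extract_links(fetch(url), url)` for each `url` in `urls_dict.values()`, followed by a `fetch` of each returned link.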
Store those files in a local repository (probably git-ignored), and probably store the files with their markdown anchor address.
So for example:
medicare-documentation/
---- docs/
---- data/
---- ---- resdac/
---- ---- ---- rif-pages/
---- ---- ---- variables/
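One way to map a page to a file in that layout, as a sketch. It assumes "markdown anchor address" means a roughly GitHub-style heading slug of the page title, and that pages are saved as `.html`. `anchor_slug`, `local_path`, and `DATA_ROOT` are hypothetical names.

```python
import re
from pathlib import Path

# Relative to the repository root, matching the layout above.
DATA_ROOT = Path("data") / "resdac"


def anchor_slug(title):
    """Approximate a markdown heading anchor: lowercase, no punctuation, hyphens for spaces."""
    slug = title.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)  # drop punctuation
    return re.sub(r"\s", "-", slug)       # each whitespace character becomes a hyphen


def local_path(title, kind):
    """Return where a page should be stored; kind is 'rif-pages' or 'variables'."""
    if kind not in ("rif-pages", "variables"):
        raise ValueError(f"unknown kind: {kind!r}")
    return DATA_ROOT / kind / f"{anchor_slug(title)}.html"
```

Slugging the title rather than the URL avoids collisions, since every RIF URL in `urls_dict` ends in the same `data-documentation` segment.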
Then also store a dataset recording which RIF pages reference which variables, along with the local path at which each page is stored.
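A sketch of what that dataset could look like as a CSV with one row per (RIF page, variable) pair. The column names and the `write_link_table` helper are assumptions, not what the commit implements.

```python
import csv

# Hypothetical column names for the link table.
FIELDS = ["rif_page", "variable", "local_path"]


def write_link_table(links, out):
    """Write one CSV row per (RIF page, variable) pair.

    `links` maps a RIF page name to a list of (variable, local_path) tuples;
    `out` is any writable text file object.
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for rif_page, variables in links.items():
        for variable, path in variables:
            writer.writerow(
                {"rif_page": rif_page, "variable": variable, "local_path": path}
            )
```

A long (tidy) table keeps the many-to-many relationship simple: a variable referenced by several RIF pages just appears on several rows.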
Resolved in https://github.com/kylebarron/medicare-documentation/commit/1c9d53af039f832b931b93821f2f33b78a70cec2