18F / FAC-Distiller

Federal Audit Clearing House Distiller

Rework to use CFDA numbers instead of agency prefix #6

Closed fureigh closed 4 years ago

fureigh commented 5 years ago

User story

As an agency representative, I want to see results that are specific to my subagency, not the entire parent agency, so that I don't have to manually filter them.

Acceptance criteria

Outstanding questions

Implementation notes

~For testing purposes: DOT's prefix is 20 and DOT FTA's CFDA prefix is 20.5.~

If we allowed user accounts, we could save a user's subagency and not require them to enter the associated CFDA numbers each time.

For testing purposes, here are some CFDA numbers for DOT FTA grants.

fureigh commented 5 years ago

Update: I learned that CFDA prefixes are not reliably correlated with subagencies. Each agency assigns them differently.

This means the "minimally" above is the way to go. We'll have to provide a way for users to enter CFDA numbers. Goodbye, single-step dropdown menu.
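If users do end up typing CFDA numbers in by hand, the input needs some light validation. Here is a minimal sketch, assuming a CFDA number is always two digits, a dot, and three digits. The function name `parse_cfda_numbers` and the separator rules are hypothetical, not part of the Distiller.

```python
import re

# Assumption: a CFDA number looks like "NN.NNN" (two digits, dot, three digits).
CFDA_PATTERN = re.compile(r"^\d{2}\.\d{3}$")


def parse_cfda_numbers(raw):
    """Split free-form user input into (valid, rejected) CFDA numbers."""
    valid, rejected = [], []
    # Accept commas, semicolons, or whitespace as separators.
    for token in re.split(r"[,;\s]+", raw.strip()):
        if not token:
            continue
        if CFDA_PATTERN.match(token):
            if token not in valid:  # drop duplicates, keep first-seen order
                valid.append(token)
        else:
            rejected.append(token)
    return valid, rejected
```

Calling `parse_cfda_numbers("20.500, 20.507 20.5")` would keep the two full numbers and report the bare prefix `20.5` as rejected, so the UI can ask the user to correct it.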

fureigh commented 5 years ago

In case it's relevant: https://catalog.data.gov/dataset/catalog-of-federal-domestic-assistance-cfda

fureigh commented 4 years ago

CFDA stands for "Catalog of Federal Domestic Assistance." CFDA numbers index grants, essentially.

fureigh commented 4 years ago

(Updated outstanding questions and implementation notes, including sample CFDA numbers.)

bpdesigns commented 4 years ago

It might be possible to map the two-digit CFDA prefixes to parent agencies, and then map subagencies using the "Federal Agency (030)" column of the "assistance listing" CSV.

csv located here
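A rough sketch of that lookup, assuming the CSV has the "Program Number" and "Federal Agency (030)" headers mentioned in this thread. The function name `numbers_by_agency` is hypothetical, and it takes any iterable of CSV lines so it can be tried on an open file or an in-memory string.

```python
import csv
from collections import defaultdict


def numbers_by_agency(lines):
    """Map each 'Federal Agency (030)' value to its sorted program numbers."""
    reader = csv.reader(lines)
    # Header cells may carry stray whitespace, so strip them before lookup.
    headers = [header.strip() for header in next(reader)]
    number_col = headers.index("Program Number")
    agency_col = headers.index("Federal Agency (030)")

    grouped = defaultdict(set)
    for row in reader:
        grouped[row[agency_col].strip()].add(row[number_col].strip())
    return {agency: sorted(numbers) for agency, numbers in grouped.items()}


# Usage (file name and encoding taken from the script in this thread):
# with open("AssistanceListings_DataGov_PUBLIC_CURRENT.csv",
#           encoding="ISO-8859-1", newline="") as f:
#     lookup = numbers_by_agency(f)
```

With a table like this, a user could pick a subagency by name and the associated CFDA numbers could be filled in for them, assuming the agency column is consistent enough to serve as a key.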

cantsin commented 4 years ago

Here's a quick and dirty Python script to parse the above CSV -- I wrote this when trying to determine CFDA uniqueness.

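import csv
from collections import defaultdict

# Maps each program title to the set of CFDA program numbers that use it.
cfdas = defaultdict(set)

# Read-only access is enough; newline="" is what the csv module recommends.
with open("AssistanceListings_DataGov_PUBLIC_CURRENT.csv", "r", encoding="ISO-8859-1", newline="") as f:
    cr = csv.reader(f)
    headers = [header.strip() for header in next(cr)]
    # NOTE: despite their names, cfda_index points at the title column and
    # cfda_name_index at the number column, so titles are the dictionary keys.
    cfda_index = headers.index('Program Title')
    cfda_name_index = headers.index('Program Number')
    for line in cr:
        title = line[cfda_index]
        number = ' '.join(line[cfda_name_index].split())  # collapse whitespace
        cfdas[title].add(number)

# Look for inconsistencies: titles shared by more than one program number.
for title, numbers in cfdas.items():
    if len(numbers) > 1:
        print(title, numbers)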

Result:

Cultural Resources Management {'15.159', '15.511', '15.946'}
Challenge Cost Share {'15.642', '15.238', '15.943'}
National Fire Plan-Wildland Urban Interface Community Fire Assistance {'15.948', '15.674'}
International Labor Programs {'17.401', '17.007'}
Organization of American States Programs {'19.129', '19.948'}
Post-9/11 Veterans Educational Assistance {'64.027', '64.028'}
Multiple Approaches to Support Young Breast Cancer Survivors and Metastatic Breast Cancer Patients {'93.373', '93.374'}
Prevention and Control of Chronic Disease and Associated Risk Factors in the U.S. Affiliated Pacific Islands, U.S. Virgin Islands, and P. R. {'93.377', '93.792'}

danielnaab commented 4 years ago

Wanting to make sure I understand this correctly....

Since only 8 program titles map to more than one CFDA number, I looked into the possibility that, while they share names, they may still be distinct entities.

The duplicates seem to correspond to distinct programs/grants, which would indicate we don't need special treatment for them. Am I understanding this correctly?

For example, here are descriptions for the first row above, "Cultural Resources Management".

15.159

To ensure the proper management, protection, and preservation of cultural resources over which the BIA maintains responsibility; furnish secure, short-term housing and care for cultural resources recovered during investigations; provide for the curation, stewardship, and public access to BIA museum collections and other cultural resources, including the increase of public awareness, appreciation, and knowledge of these resources.

{
    "list": [
        {
            "act": {
                "description": "Archaeological Resources Protection Act of 1979, as amended (ARPA), 16 U.S.C. §§ 470aa–mm; The Lacey Act of 1900, as amended, 16 U.S.C. §§ 3371-78, and 18 U.S.C. 42; Historic, archeologic, or prehistoric items and antiquities, 18 U.S.C. § 1866(b); Native American Graves Protection and Repatriation Act of 1990 (NAGPRA), 25 U.S.C. §§ 3001-3013; National Historic Preservation Act of 1966, as amended (NHPA), 54 U.S.C. § 300101 et seq.; Preservation of Historical and Archeological Data (Archeological and Historic Preservation Act of 1974, as amended), 54 U.S.C. §§ 312501-312508; Monuments, Ruins, Sites, and Objects of Antiquity  (Act for the Preservation of American Antiquities of 1906 (Antiquities Act)) 54 U.S.C. §§ 320301-320303."
            },
            "authorizationTypes": {
                "USC": false,
                "act": true,
                "executiveOrder": false,
                "publicLaw": false,
                "statute": false
            }
        }
    ]
}

15.511

To manage and protect cultural resources on Reclamation land; provide for the curation of and public access to collectible heritage assets, including the increase of public awareness, appreciation, and knowledge of these resources; and provide for the protection and preservation of the tribal cultural resources impacted by operations of some Reclamation projects.

{
    "list": [
        {
            "act": {
                "description": "National Historic Preservation Act of 1966, Pub. L. 89 665, as amended (54 U.S.C. 300101 et seq.); Archaeological and Historic Preservation Act of 1974, P.L. 93-291 (54 U.S.C. 312505 et seq.); Native American Graves Protection and Repatriation Act, P.L. 101-601 (25 U.S.C. 3001 et seq.)."
            },
            "authorizationTypes": {
                "USC": false,
                "act": true,
                "executiveOrder": false,
                "publicLaw": false,
                "statute": false
            }
        }
    ]
}

15.946

"The National Park Service (NPS) conducts cultural resource stewardship largely at the park level. To carry out and further this stewardship responsibility, the Service implements programs that encompass a broad range of research, operational, and educational activities. The NPS conducts:
Research to identify, evaluate, document, register, and establish basic information about cultural resources and traditionally associated peoples;
Planning to ensure that management processes for making decisions and setting priorities integrate information about cultural resources, and provide for consultation and collaboration with outside entities; and
Stewardship to ensure that cultural resources are preserved and protected, receive appropriate treatments (including maintenance), and are made available for public understanding and enjoyment."

{
    "list": [
        {
            "act": {
                "description": "54 U.S.C. §320102(f) commonly known as the American Antiquities Act; 54 U.S.C. §302304(b) State Historic Preservation Programs; 54 U.S.C. §101702(a) Cooperative Agreements, Transfer of Service Appropriated Funds; and; 54 USC §101701(b) Challenge Cost-share Agreement Authority."
            },
            "authorizationTypes": {
                "act": true
            }
        }
    ]
}

bpdesigns commented 4 years ago

Thanks @danielnaab, I think you are correct. I just reviewed the CFDA numbers for agencies as listed in AssistanceListings_DataGov_PUBLIC_CURRENT.csv on the beta.sam website, and it looks like this code is stating something like:

when what we should be connecting is

@cantsin let me know if I have this wrong

bpdesigns commented 4 years ago

@cantsin I think instead of

cfda_index = headers.index('Program Title')
cfda_name_index = headers.index('Program Number')

we maybe want

cfda_index = headers.index('Program Title')
cfda_name_index = headers.index('Federal Agency (030)')

since we want to match the program number to the federal agency.

bpdesigns commented 4 years ago

@danielnaab we don't need to sort by specific grant, or program, just the agency and subagency.
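For the agency and subagency view, one way to check how prefixes line up with agencies is to group the "Federal Agency (030)" values by two-digit prefix. This is only a sketch under the same header assumptions as the script above. `agencies_by_prefix` is a hypothetical name, and it reads the suggestion as matching program numbers (not titles) to agencies.

```python
import csv
from collections import defaultdict


def agencies_by_prefix(lines):
    """Map each two-digit CFDA prefix to the set of agency names using it."""
    reader = csv.reader(lines)
    headers = [header.strip() for header in next(reader)]
    number_col = headers.index("Program Number")
    agency_col = headers.index("Federal Agency (030)")

    grouped = defaultdict(set)
    for row in reader:
        # "15.159" -> "15": everything before the dot is the prefix.
        prefix = row[number_col].strip().split(".")[0]
        grouped[prefix].add(row[agency_col].strip())
    return dict(grouped)
```

A prefix that maps to several agency names would list the subagencies sitting under one parent. Whether those names are clean enough to show in a dropdown is something to verify against the real file.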

cantsin commented 4 years ago

Good catch. @danielnaab and @bpdesigns you are right here. Thanks for the correction!

bpdesigns commented 4 years ago