CityOfLosAngeles / open-sdg-data-starter

A starting point for the data repository of an Open SDG platform implementation.
MIT License

Update codebase to align w/ latest configuration of the parent project #44

Open jaylenw opened 4 months ago

jaylenw commented 4 months ago

Problem

The open-sdg-data-starter template repository has been updated significantly. This repository needs the applicable changes applied so that it is up to date with what's included in the template.

Solution

Refer to the updated template and update the codebase here as needed considering our unique configuration.

jaylenw commented 3 months ago

Hello @RV-LACity @naomiikd @kerrylacity, I ran python scripts/check_data.py, which completed with no issues, and then ran python scripts/build_data.py. The first half of the build_data.py output consists of deprecation warnings; we won't attempt to resolve those ourselves, as we hope the parent project addresses them. The latter half of the output suggests that some indicators contain duplicate values. This is something you two will have to resolve, @kerrylacity @naomiikd.

vscode ➜ /workspaces/open-sdg-data-starter (ghi-44) $ python scripts/check_data.py
vscode ➜ /workspaces/open-sdg-data-starter (ghi-44) $ python scripts/build_data.py 
/home/vscode/.local/lib/python3.11/site-packages/frictionless/actions/describe.py:129: UserWarning: Function "describe_schema" is deprecated (use "Schema.describe").
  warnings.warn(message, UserWarning)
/home/vscode/.local/lib/python3.11/site-packages/frictionless/actions/describe.py:105: UserWarning: Function "describe_resource" is deprecated (use "Resource.describe").
  warnings.warn(message, UserWarning)
/home/vscode/.local/lib/python3.11/site-packages/frictionless/plugins/pandas/parser.py:64: FutureWarning: iteritems is deprecated and will be removed in a future version. Use .items instead.
  for name, dtype in dataframe.dtypes.iteritems():
/home/vscode/.local/lib/python3.11/site-packages/frictionless/actions/describe.py:76: UserWarning: Function "describe_package" is deprecated (use "Package.describe").
  warnings.warn(message, UserWarning)
00:00:00 - 3.3.2 - Duplicate values for year 2012: 625 and 625 in series: {'Gender': '', 'Race/ Ethnicity': '', 'Birthplace': '', 'Units': 'By Gender'}
00:00:00 - 3.3.2 - Duplicate values for year 2013: 661 and 661 in series: {'Gender': '', 'Race/ Ethnicity': '', 'Birthplace': '', 'Units': 'By Gender'}
00:00:00 - 3.3.2 - Duplicate values for year 2014: 586 and 586 in series: {'Gender': '', 'Race/ Ethnicity': '', 'Birthplace': '', 'Units': 'By Gender'}
00:00:00 - 3.3.2 - Duplicate values for year 2015: 602 and 602 in series: {'Gender': '', 'Race/ Ethnicity': '', 'Birthplace': '', 'Units': 'By Gender'}
00:00:00 - 3.3.2 - Duplicate values for year 2016: 550 and 550 in series: {'Gender': '', 'Race/ Ethnicity': '', 'Birthplace': '', 'Units': 'By Gender'}
00:00:00 - 4.1.2 - Duplicate values for year 2015: 32.6 and 32.1 in series: {'Units': "Bachelor's degree or higher", 'Gender': 'Female', 'Race/ Ethnicity': ''}
00:00:00 - 17.8.1 - Duplicate values for year 2018:  and 88.3 in series: {'Age': '', 'Race': '', 'Education Attainment': '', 'Employment Status': '', 'Units': 'Percentage by Employment'}
00:00:00 - 17.8.1 - Duplicate values for year 2018:  and 91.8 in series: {'Age': '', 'Race': '', 'Education Attainment': '', 'Employment Status': 'Employed', 'Units': 'Percentage by Employment'}
00:00:00 - 17.8.1 - Duplicate values for year 2018:  and 80.0 in series: {'Age': '', 'Race': '', 'Education Attainment': '', 'Employment Status': 'Not in Labor Force', 'Units': 'Percentage by Employment'}
00:00:00 - 17.8.1 - Duplicate values for year 2018:  and 87.6 in series: {'Age': '', 'Race': '', 'Education Attainment': '', 'Employment Status': 'Unemployed', 'Units': 'Percentage by Employment'}
00:00:00 - 17.6.1 - Duplicate values for year 2018:  and 88.3 in series: {'Age': '', 'Race': '', 'Education Attainment': '', 'Employment Status': '', 'Units': 'Percentage by Employment'}
00:00:00 - 17.6.1 - Duplicate values for year 2018:  and 91.8 in series: {'Age': '', 'Race': '', 'Education Attainment': '', 'Employment Status': 'Employed', 'Units': 'Percentage by Employment'}
00:00:00 - 17.6.1 - Duplicate values for year 2018:  and 80.0 in series: {'Age': '', 'Race': '', 'Education Attainment': '', 'Employment Status': 'Not in Labor Force', 'Units': 'Percentage by Employment'}
00:00:00 - 17.6.1 - Duplicate values for year 2018:  and 87.6 in series: {'Age': '', 'Race': '', 'Education Attainment': '', 'Employment Status': 'Unemployed', 'Units': 'Percentage by Employment'}

At the very least, I see the _site directory being created with files in it. I will leave checking the validity of those files to you two as well.

I will push up my changes that allowed for us to run the scripts to this branch in a bit.
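
To help locate the rows behind the "Duplicate values" messages above, here is a minimal sketch that groups CSV rows by everything except the value column and reports any group with more than one value. The sample data, the function name find_duplicate_series, and the column name "Value" are assumptions for illustration; this is not how the build script itself performs the check.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample shaped like an indicator CSV; the real files may differ.
SAMPLE_CSV = """Year,Gender,Race/ Ethnicity,Birthplace,Units,Value
2012,,,,By Gender,625
2012,,,,By Gender,625
2012,Female,,,By Gender,300
2013,,,,By Gender,661
"""


def find_duplicate_series(csv_text, value_column="Value"):
    """Return {key: [values]} for rows that share a year and disaggregation."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = defaultdict(list)
    for row in reader:
        # Every column except the value identifies the year and the series.
        key = tuple(
            (name, cell) for name, cell in row.items() if name != value_column
        )
        seen[key].append(row[value_column])
    # Keep only the keys that occur more than once.
    return {key: values for key, values in seen.items() if len(values) > 1}


duplicates = find_duplicate_series(SAMPLE_CSV)
for key, values in duplicates.items():
    print(dict(key), "->", values)
```

Groups whose values are identical (625 and 625) are likely plain repeated rows, while groups with differing values (such as 32.6 and 32.1 for 4.1.2) need a decision about which figure is correct.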

kerrylacity commented 3 months ago

Thank you for bringing this item to our attention!

I will mark these SDG goals as having duplicate entries, and I'm sure that when we go through the .csv files I can root them out.

Thanks Jaylen!

kerrylacity commented 3 months ago

@jaylenw Looking at _site/en/data to resolve those duplication issues, I opened 3-3-1.csv and was wondering how to write a note (comment) in Python. Is it #? I wanted to flag the following line for deletion to remedy the duplication error: 2012,,,,By Gender,625

I think that would solve one of our issues.

Thank you.
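
As an alternative to flagging the line by hand, here is a minimal sketch that drops rows that are exact repeats of an earlier row while keeping the original order. The sample text and the function name drop_exact_duplicates are hypothetical. Since _site is produced by the build, the change would presumably need to be made to the source CSV rather than to the copy under _site; that is an assumption about this repository's layout.

```python
import csv
import io

# Hypothetical sample containing the repeated 2012 row quoted above.
SAMPLE_CSV = (
    "Year,Gender,Race/ Ethnicity,Birthplace,Units,Value\n"
    "2012,,,,By Gender,625\n"
    "2012,,,,By Gender,625\n"
    "2013,,,,By Gender,661\n"
)


def drop_exact_duplicates(csv_text):
    """Return the CSV text with exact repeated rows removed, order preserved."""
    reader = csv.reader(io.StringIO(csv_text))
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    seen = set()
    for row in reader:
        key = tuple(row)
        if key in seen:
            # An identical row was already written, so skip this one.
            continue
        seen.add(key)
        writer.writerow(row)
    return output.getvalue()


cleaned = drop_exact_duplicates(SAMPLE_CSV)
print(cleaned)
```

This only removes rows that match in every column. Rows that share a year and series but carry different values are left untouched, because choosing the correct figure requires checking the data source.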

kerrylacity commented 3 months ago

[image] For further clarification, this line also appears redundant when viewed in VS Code.

jaylenw commented 3 months ago

@kerrylacity, yes, to make a note or comment in a Python source file, use #. For additional details, refer to https://www.simplilearn.com/tutorials/python-tutorial/comments-in-python#:~:text=Comments%20in%20Python%20are%20identified,a%20multi%2Dline%20comment%20block.
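
A short sketch of what that looks like in practice, with one caveat: # is comment syntax for Python source files, not for CSV data. The CSV text below is hypothetical, and the example uses Python's standard csv module, which may not be what the build scripts use to read the data.

```python
import csv
import io

# A full-line comment: Python ignores this line entirely.
year = 2012  # An inline comment: ignored from the # to the end of the line.

# Hypothetical CSV text with one row prefixed by "#".
csv_text = "Year,Units,Value\n# 2012,By Gender,625\n2013,By Gender,661\n"

# The csv module has no notion of comments, so the "#" row is read as data.
rows = list(csv.reader(io.StringIO(csv_text)))
print(rows[1])
```

Because the prefixed row is still parsed as data here, putting # in front of a CSV line would likely not take it out of the build; deleting the line from the file is the safer way to remove it.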

kerrylacity commented 3 months ago

Say I wanted to make changes to the code in VS Code; what would the next steps be?

Do you have availability to meet Thursday?

Thank you!

jaylenw commented 3 weeks ago

@kerrylacity, my apologies, I overlooked this message. Let me ping you.