zazwaz12 / CITS3200---National-Housing-Simulation

National Housing Simulation - mapping data points from the G-NAF and the census data sets.

Issue 23 allocate building coordinates to a sa1 region #56

Closed ctmes closed 1 month ago

ctmes commented 2 months ago

Title: "Closes #23, #21"

Summary of Changes

Added a Polars conversion of the join between the GNAF example data and the shapefiles. Logging is included, with user-selectable options in the configurations.yml file.

Optional Further Details

join_on_sa1.py:

  1. Load Configuration: Reads a YAML file (configurations.yml) to load configuration settings such as file paths and parameters.
  2. Read Shapefile: Reads a shapefile and logs the total number of rows using Polars and Geopandas. I'm not sure whether geopolars has equivalent functions/processes.
  3. Read CSV Data: Loads a CSV file into a Polars DataFrame and logs the top records.
  4. Prepare Geospatial Points: Converts longitude and latitude data from the CSV into geospatial points using Geopandas and reprojects them to a specified CRS (Coordinate Reference System).
  5. Chunked Data Processing: Breaks geospatial data into chunks and uses parallel processing (via pathos) to spatially join point data with area data (e.g., SA2/SA1 regions).
  6. Nearest Point Join: For points that don't fall within any region, it performs a nearest spatial join to assign them the closest area.
  7. Random Redistribution: Randomly redistributes points within their respective areas for a specified number of iterations, shuffling longitude and latitude values.
  8. Visualize Results: Plots both the original and redistributed positions of points for selected area codes, showing the spatial shift in points.
  9. Save Results: Joins the processed data with the original DataFrame and exports the final result to a CSV file.
  10. Log Summary: Logs key information such as the number of missing area assignments and the results of the random redistribution.
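
Steps 5 to 7 above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code in join_on_sa1.py: axis-aligned boxes stand in for the SA1 polygons, and the names `AREAS`, `box_distance`, `assign_area` and `shuffle_within_areas` are all hypothetical. The actual script uses Geopandas spatial joins and pathos for chunked parallel processing, neither of which is shown here.

```python
import random
from collections import defaultdict

# Hypothetical stand-in for SA1 polygons: boxes as (min_lon, min_lat, max_lon, max_lat).
AREAS = {
    "A": (0.0, 0.0, 1.0, 1.0),
    "B": (2.0, 0.0, 3.0, 1.0),
}


def box_distance(lon, lat, box):
    """Euclidean distance from a point to a box (0.0 if the point is inside)."""
    min_lon, min_lat, max_lon, max_lat = box
    dx = max(min_lon - lon, 0.0, lon - max_lon)
    dy = max(min_lat - lat, 0.0, lat - max_lat)
    return (dx * dx + dy * dy) ** 0.5


def assign_area(lon, lat, areas=AREAS):
    """Return (area_code, matched_within).

    A point inside an area has distance 0.0 to it; a point outside every
    area falls back to the closest one, mirroring the nearest-join step.
    """
    code = min(areas, key=lambda c: box_distance(lon, lat, areas[c]))
    return code, box_distance(lon, lat, areas[code]) == 0.0


def shuffle_within_areas(points, codes, iterations=1, seed=None):
    """Shuffle (lon, lat) pairs among points that share an area code."""
    rng = random.Random(seed)
    by_area = defaultdict(list)
    for idx, code in enumerate(codes):
        by_area[code].append(idx)
    result = list(points)
    for _ in range(iterations):
        for indices in by_area.values():
            coords = [result[i] for i in indices]
            rng.shuffle(coords)
            for i, xy in zip(indices, coords):
                result[i] = xy
    return result


points = [(0.2, 0.5), (0.8, 0.1), (1.4, 0.5), (2.5, 0.5)]
assigned = [assign_area(lon, lat) for lon, lat in points]
codes = [code for code, _ in assigned]
shuffled = shuffle_within_areas(points, codes, iterations=3, seed=0)
```

The third point lies outside both boxes, so it is assigned by the nearest-area fallback. Shuffling only swaps coordinates between points with the same area code, so every point stays within the area it was assigned to.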

configurations.yml:

  1. Logging Configuration: Sets up logging to output logs to a file with timestamps, backtrace, and diagnostic information at the "DEBUG" level, with logs retained for 7 days.
  2. File Paths: Specifies paths to different datasets, including census data, GNAF data, a shapefile, a CSV file for houses, and the output CSV file location.
  3. CRS and Processing Parameters: Defines the coordinate reference system (CRS) as "EPSG:7844" and configures the number of cores for parallel processing and iterations for random redistribution.
  4. Shapefile and CSV Input/Output: Provides the paths to the input shapefile (SA1_2021_AUST_GDA2020.shp) and houses CSV, as well as the output path for the processed CSV file.
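
For illustration, the parsed configuration might look something like the dictionary below once the YAML is loaded. The section and key names, the `path/to/...` values, and the core and iteration counts are assumptions made for this sketch; only the "DEBUG" level, the 7-day retention, the "EPSG:7844" CRS and the shapefile name come from the description above. The `missing_keys` helper is likewise hypothetical and simply shows one way to check that nothing is missing before the pipeline starts.

```python
# Hypothetical shape of the parsed configurations.yml; key names are assumptions.
CONFIG = {
    "logging": {
        "sink": "path/to/logs/app.log",  # placeholder path
        "level": "DEBUG",
        "backtrace": True,
        "diagnose": True,
        "retention": "7 days",
    },
    "paths": {
        "census_dir": "path/to/census",  # placeholder path
        "gnaf_dir": "path/to/gnaf",  # placeholder path
        "shapefile": "path/to/SA1_2021_AUST_GDA2020.shp",
        "houses_csv": "path/to/houses.csv",  # placeholder path
        "output_csv": "path/to/output.csv",  # placeholder path
    },
    "processing": {
        "crs": "EPSG:7844",
        "num_cores": 4,  # illustrative value
        "iterations": 10,  # illustrative value
    },
}

# Keys this sketch treats as mandatory, grouped by section.
REQUIRED = {
    "logging": ("sink", "level", "backtrace", "diagnose", "retention"),
    "paths": ("shapefile", "houses_csv", "output_csv"),
    "processing": ("crs", "num_cores", "iterations"),
}


def missing_keys(config, required=REQUIRED):
    """Return the 'section.key' names that are absent from the config."""
    missing = []
    for section, keys in required.items():
        present = config.get(section, {})
        missing.extend(f"{section}.{key}" for key in keys if key not in present)
    return missing
```

Calling `missing_keys(CONFIG)` returns an empty list for a complete configuration, and a list such as `["processing.crs", ...]` when a section or key has been left out.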


ctmes commented 2 months ago

IDK how to fix string.py, # type: ignore won't ignore it all

ctmes commented 2 months ago

No testing done!!

Current set up for testing:

image

But there are some issues in main.py, path.py, string.py that I don't know how to fix

SodaVolcano commented 2 months ago

Did you do git merge main? #49 should have fixed the issues but your PR is reverting changes from that PR

ctmes commented 2 months ago

Yeah, I've been trying to pull in a variety of ways but it doesn't seem to be doing anything [image]

git merge origin/main and git pull and git fetch say the same thing too

SodaVolcano commented 2 months ago

Did you checkout main and do git pull to fetch updates from main before merging? Or maybe you got a merge conflict for string.py and path.py and overwrote the changes from main with your own branch?

ctmes commented 2 months ago

I did checkout main, git pull; this did nothing since I git fetch whenever I open vscode anyway. I don't see any merge conflicts.

Strangely, I did revert a commit locally, then git pulled and it pulled it properly.

ctmes commented 2 months ago

[image] There is this though for some reason, even though I've fetched, pulled and pushed multiple times

SodaVolcano commented 2 months ago

Yeah that's weird, I can't merge from main either, so you likely got a merge conflict and did something with it? Try checking out afb612d4dff363d72f1a1365d051557bd78bd67d (i.e. revert one commit) and do git merge; that will give you a merge conflict which you can fix

ctmes commented 2 months ago

OK, I have no idea what is going on. I manually copied and edited the fixed versions of the logging.py and string.py files, which worked. Then, while debugging the new read_config function, I saw that join_on_sa1.py now has errors which didn't exist before

ctmes commented 2 months ago

EDIT: OK I'm goated It's fixed

ctmes commented 2 months ago

OK, to be more precise: I manually copied the working main.py, path.py and string.py files so they match (integrating the read_config function as well), and then applied type annotations to the join_on_sa1.py file. There should be no conflicts, but if there are I'll review and fix them

ctmes commented 1 month ago

@ctmes

Yup, just seen all the changes. I assume the EPSG in the testing is just an arbitrary value since all other values are arbitrary?

SodaVolcano commented 1 month ago

yup it's just there to fill the parameter