pysal / segregation

Segregation Measurement, Inferential Statistics, and Decomposition Analysis
https://pysal.org/segregation/
BSD 3-Clause "New" or "Revised" License

Commit of `kl_divergence_profile` function and walkthrough notebook - following Google Summer of Code 2021 #185

Closed by noahbouchier 2 years ago

noahbouchier commented 2 years ago

Hi there,

This PR comprises the primary outputs of my Google Summer of Code (GSoC) project: a tool to calculate the KL Divergence Profile segregation metric from Olteanu et al.'s 2019 paper, 'Segregation through the multiscalar lens'.
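For readers unfamiliar with the metric, here is a minimal sketch of the general idea as I understand it from the paper: for each unit, aggregate neighbouring units from nearest to farthest and, at each step, measure the Kullback-Leibler divergence between the aggregated group composition and the citywide composition. This is not the `kl_divergence_profile` function from this PR. The name `kl_profile_sketch`, its arguments, and the use of plain Euclidean distance between centroids are all assumptions made for illustration.

```python
import numpy as np


def kl_profile_sketch(counts, coords):
    """Illustrative KL divergence profiles (hypothetical helper, not the PR's code).

    counts : (n_units, n_groups) array of group populations per unit
    coords : (n_units, 2) array of unit centroid coordinates
    Returns (divergence, cum_pop_share), each of shape (n_units, n_units).
    Row i is the profile for unit i as nearer-to-farther units are added.
    """
    counts = np.asarray(counts, dtype=float)
    coords = np.asarray(coords, dtype=float)

    # Global group proportions: the reference distribution q.
    q = counts.sum(axis=0) / counts.sum()

    # Pairwise Euclidean distances (an assumption; other metrics are possible).
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # For each unit, indices of all units sorted from nearest to farthest.
    order = np.argsort(dist, axis=1, kind="stable")

    # Cumulative group counts over each unit's growing neighbourhood.
    cum_counts = np.cumsum(counts[order], axis=1)  # (n_units, n_units, n_groups)
    cum_total = cum_counts.sum(axis=2)             # (n_units, n_units)

    with np.errstate(divide="ignore", invalid="ignore"):
        p = cum_counts / cum_total[:, :, None]
        # KL(p || q), using the convention 0 * log(0) = 0.
        terms = np.where(p > 0, p * np.log(p / q), 0.0)
    divergence = terms.sum(axis=2)

    return divergence, cum_total / counts.sum()


# Toy data: four units on a line, two fully separated groups.
counts = np.array([[10, 0], [10, 0], [0, 10], [0, 10]])
coords = np.array([[0, 0], [1, 0], [2, 0], [3, 0]])
divergence, share = kl_profile_sketch(counts, coords)
```

Each row of `divergence` starts at the unit's own divergence from the citywide mix and falls to zero once every unit has been aggregated, since the neighbourhood then equals the whole study area. `share` gives the fraction of the total population covered at each step, which can serve as the horizontal axis when plotting a profile.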

For further information on the project, please see the project's GitHub repository, including a summary document of the project's achievements.

I have worked hard on this function, and it is now in the home straight: review, refinement and feedback before it is incorporated into the PySAL library.

I look forward to responding to the feedback of @knaaptime and any other willing members of the PySAL community, working with them to get this function up to standard and passing all the tests.

I have greatly enjoyed my time as a PySAL GSoC intern, and hope to be able to continue both my contribution and personal growth through involvement with the PySAL community!

Best, Noah

codecov-commenter commented 2 years ago

Codecov Report

Merging #185 (88bd960) into master (ed263d4) will decrease coverage by 0.81%. The diff coverage is 0.00%.


@@            Coverage Diff             @@
##           master     #185      +/-   ##
==========================================
- Coverage   63.77%   62.96%   -0.82%     
==========================================
  Files         118      119       +1     
  Lines        4445     4477      +32     
==========================================
- Hits         2835     2819      -16     
- Misses       1610     1658      +48     
| Impacted Files | Coverage Δ |
|---|---|
| segregation/spatial/kl_divergence_profile.py | 0.00% <0.00%> (ø) |
| segregation/inference/comparative.py | 61.64% <0.00%> (-10.96%) :arrow_down: |
| segregation/multigroup/multi_gini.py | 100.00% <0.00%> (ø) |
| segregation/multigroup/multi_dissim.py | 100.00% <0.00%> (ø) |
| segregation/multigroup/multi_diversity.py | 100.00% <0.00%> (ø) |
| segregation/multigroup/multi_divergence.py | 100.00% <0.00%> (ø) |
| segregation/multigroup/multi_info_theory.py | 100.00% <0.00%> (ø) |
| segregation/multigroup/multi_norm_exposure.py | 100.00% <0.00%> (ø) |
| segregation/multigroup/simpsons_interaction.py | 100.00% <0.00%> (ø) |
| segregation/multigroup/multi_squared_coef_var.py | 100.00% <0.00%> (ø) |

... and 3 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update ed263d4...88bd960.

noahbouchier commented 2 years ago

Hey,

My most recent commits represent:

Following this commit, the code is ready for review. I hope to continue to update the workbook very soon, but please feel free to provide feedback on the current version.

Thanks so much

knaaptime commented 2 years ago

Sorry this has taken me so long, @noahbouchier. I've started writing this response on a few different occasions but never managed to press the button.

This is great! Thanks for the contribution. The function itself looks great, so I'd like to get it merged quickly, but there are a handful of small tweaks we need to make so that it conforms with the rest of the library. I am more than happy to just go ahead and make these changes myself, since that will be fastest, but if you'd rather do it yourself I can also just offer pointers about what needs to be done. Your call :)

In the current version of the package, we keep each index in a separate file. Check out the multigroup dissimilarity index for an example. That's the biggest change, and it'll be easier to track diffs once we move this into its own file.

Here's a rough checklist of what we need to do

If you want me to just go ahead and handle that stuff, what I'll probably do is merge this PR so you get the contribution in the git history, then move the code into its own file and hammer out the necessary changes