tnc-br / ddf-isoscapes

3 stars 0 forks source link

Threshold pvalue #172

Closed gretaabib closed 1 year ago

gretaabib commented 1 year ago

As mentioned in Ddf Research Trip Report - 8/11 doc, changed the code to meet the following request: "We have a system that detects fraud at a min radius between 500 to 1500 with X precision at Y recall."

Outputs were recorded in p_values threshold doc.

review-notebook-app[bot] commented 1 year ago

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

rothn commented 1 year ago

Why are we bothering with a min radius at all as part of our problem framing per our prior discussions? If this is just so we can compare with another system, this could always be a side-output instead of a central part of the problem definition.

-Nicholas

On Wed, Aug 16, 2023, 1:43 PM review-notebook-app[bot] < @.***> wrote:

Check out this pull request on [image: ReviewNB] https://app.reviewnb.com/tnc-br/ddf-isoscapes/pull/172

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB https://www.reviewnb.com/?utm_source=gh

— Reply to this email directly, view it on GitHub https://github.com/tnc-br/ddf-isoscapes/pull/172#issuecomment-1681030684, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABDTDNZDIZFGTBTJYSDJ2TTXVUBDTANCNFSM6AAAAAA3S4CVXM . You are receiving this because you are subscribed to this thread.Message ID: @.***>

erickzul commented 1 year ago

Hey Greta, I made a copy with some suggested changes. Mainly, we want to make sure that the p value of the precision and recall match, as well as making sure in the pipeline we give the user the ability to set their p value. PTAL: https://colab.sandbox.google.com/drive/10pkvfYmoYOyTOD-wLlMos-QxcQgIDbK6?userstoinvite=benwu%40google.com&sharingaction=manageaccess&role=writer

erickzul commented 1 year ago

@rothn

Per discussion in ML working session, we are keeping a min_radius of 5km per spec (which is the end of the green buffer per go/ddf-isoscape-validation). We don't have an indeterminate band at this time.

https://screenshot.googleplex.com/3brZKhvq6HpUvws

gretaabib commented 1 year ago

@erickzul included your adjustments, removed duplicated cell and also parameterized radius variables. FYI - I guess this merge depends on validation_pipeline_plot_isoscape branch.

rothn commented 1 year ago

@erickzul I think what I meant was max_radius, not min_radius. I wonder why not model this uniformly over Brazil or at least the Amazon?

erickzul commented 1 year ago

Based on the scope of this PR, this looks good as it achieves what the PR seeks to achieve which is giving us a p value for a target precision or recall. Let's revisit the radius requirement in its own PR.