tnc-br / ddf-isoscapes


Fraudulent samples #152

Closed gretaabib closed 1 year ago

gretaabib commented 1 year ago
  1. Create fraudulent samples
  2. Combine the fraudulent and real samples in a single DataFrame, distinguished by a 'fraud' column
  3. Call the t-test function, which expects this DataFrame with the 'fraud' column
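
A rough sketch of these three steps, for illustration only and not the code in this PR. The function names, the `d18O` column, the shift applied to the fake values, and the hand-rolled Welch t-statistic are all assumptions; the project's own t-test function may take different inputs and return something else.

```python
import numpy as np
import pandas as pd


def make_fraudulent_samples(real: pd.DataFrame, n_per_tree: int,
                            rng: np.random.Generator) -> pd.DataFrame:
    """Hypothetical step 1: copy each real row n_per_tree times and shift its value."""
    fake = real.loc[real.index.repeat(n_per_tree)].reset_index(drop=True)
    # Assumed perturbation: a fraudulent sample carries an isotope value that
    # does not match its claimed origin, modelled here as a random offset.
    fake["d18O"] = fake["d18O"] + rng.normal(loc=3.0, scale=0.5, size=len(fake))
    return fake


def combine_samples(real: pd.DataFrame, fake: pd.DataFrame) -> pd.DataFrame:
    """Step 2: one DataFrame, with a boolean 'fraud' column marking the fake rows."""
    return pd.concat(
        [real.assign(fraud=False), fake.assign(fraud=True)],
        ignore_index=True,
    )


def welch_t_statistic(samples: pd.DataFrame, value_col: str) -> float:
    """Hypothetical step 3: Welch's t-statistic between fraud and non-fraud rows."""
    a = samples.loc[samples["fraud"], value_col].to_numpy()
    b = samples.loc[~samples["fraud"], value_col].to_numpy()
    standard_error = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return float((a.mean() - b.mean()) / standard_error)


# Made-up input values, used only to exercise the functions.
rng = np.random.default_rng(0)
real = pd.DataFrame({"tree_id": [1, 2, 3], "d18O": [25.1, 26.3, 24.8]})
fake = make_fraudulent_samples(real, n_per_tree=5, rng=rng)
samples = combine_samples(real, fake)
t_stat = welch_t_statistic(samples, "d18O")
print(samples["fraud"].value_counts())
print(t_stat)
```

Keeping the label in a single 'fraud' column means downstream code only needs one DataFrame argument, which is what step 3 relies on.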

benwulfe commented 1 year ago

I think we'd want a majority of the validation code to be within ddf_common.py. This will allow all colabs to call validation with a newly created isoscape. This can be done over time, but it means we should aim to place as much as possible within ddf_common.

gretaabib commented 1 year ago

@benwulfe I've implemented the suggestions that you made, could you please double check?

gretaabib commented 1 year ago

@erickzul I've implemented the creation of 5 fake samples per tree; could you please double check? The t-test results also improved (your results may be better than mine when you run the t-tests, because I am pretty sure that I am using an obsolete version of xgboost in the tests): accuracy 0.5411764705882353, precision 0.7526881720430108, recall 0.56.
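
For reference, a minimal sketch of how these three metrics follow from confusion-matrix counts. The counts below are not taken from this PR; they are one hypothetical set that happens to be arithmetically consistent with the three figures above, and the helper name is made up.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all samples labelled correctly
        "precision": tp / (tp + fp),     # share of flagged samples that are truly fraud
        "recall": tp / (tp + fn),        # share of true fraud samples that were flagged
    }


# Hypothetical counts (assumption): 125 fraudulent and 45 real samples.
metrics = classification_metrics(tp=70, fp=23, fn=55, tn=22)
print(metrics)
```

With these assumed counts, precision is much higher than recall: most flagged samples are fraudulent, but a large share of the fraudulent samples goes unflagged.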