Cyberjusticelab / JusticeAI

JusticeAI (ProceZeus) is a web chat bot that aims to facilitate access to judicial proceedings involving Quebec tenant/landlord law
https://cyberjusticelab.github.io/JusticeAI/docs/rendered/
MIT License

Claim: Predict $ that may be obtained #251

Closed arekmano closed 6 years ago

arekmano commented 6 years ago

Description

As a user, I would like the system to predict how much money a landlord may obtain from their tenant in a standard case.

The predicted amount should consider:

Scope of Work

Demo requirement

  1. The user does x
  2. The system does y
  n. Rest of a use case or a flow chart

Acceptance Criteria

Samuel-Campbell commented 6 years ago

Progress so far

INFO: Total precedents parsed: 44970
INFO: Total precedents with absent : 1779
INFO: Total precedents with apartment_impropre : 563
INFO: Total precedents with apartment_infestation : 0
INFO: Total precedents with asker_is_landlord : 31567
INFO: Total precedents with asker_is_tenant : 3081
INFO: Total precedents with bothers_others : 3
INFO: Total precedents with case_fee_reimbursement : 1226
INFO: Total precedents with disrespect_previous_judgement : 138
INFO: Total precedents with incorrect_facts : 26
INFO: Total precedents with landlord_inspector_fees : 513
INFO: Total precedents with landlord_notifies_tenant_retake_apartment : 384
INFO: Total precedents with landlord_pays_indemnity : 41
INFO: Total precedents with landlord_prejudice_justified : 18574
INFO: Total precedents with landlord_relocation_indemnity_fees : 673
INFO: Total precedents with landlord_rent_change : 2385
INFO: Total precedents with landlord_rent_change_doc_renseignements : 311
INFO: Total precedents with landlord_rent_change_piece_justification : 310
INFO: Total precedents with landlord_rent_change_receipts : 310
INFO: Total precedents with landlord_retakes_apartment : 1784
INFO: Total precedents with landlord_retakes_apartment_indemnity : 95
INFO: Total precedents with landlord_sends_demand_regie_logement : 582
INFO: Total precedents with landlord_serious_prejudice : 129
INFO: Total precedents with lease : 26289
INFO: Total precedents with proof_of_late : 30
INFO: Total precedents with proof_of_revenu : 2
INFO: Total precedents with rent_increased : 1
INFO: Total precedents with tenant_bad_payment_habits : 30458
INFO: Total precedents with tenant_continuous_late_payment : 4857
INFO: Total precedents with tenant_damaged_rental : 20
INFO: Total precedents with tenant_dead : 31
INFO: Total precedents with tenant_declare_insalubre : 147
INFO: Total precedents with tenant_financial_problem : 307
INFO: Total precedents with tenant_group_responsability : 1308
INFO: Total precedents with tenant_individual_responsability : 3976
INFO: Total precedents with tenant_is_bothered : 26
INFO: Total precedents with lack_of_proof : 4400
INFO: Total precedents with tenant_landlord_agreement : 1250
INFO: Total precedents with tenant_lease_fixed : 0
INFO: Total precedents with tenant_lease_indeterminate : 1167
INFO: Total precedents with tenant_left_without_paying : 7133
INFO: Total precedents with tenant_monthly_payment : 35942
INFO: Total precedents with tenant_negligence : 8
INFO: Total precedents with tenant_not_request_cancel_lease : 8
INFO: Total precedents with tenant_owes_rent : 21978
INFO: Total precedents with tenant_refuses_retake_apartment : 131
INFO: Total precedents with tenant_rent_not_paid_less_3_weeks : 2040
INFO: Total precedents with tenant_rent_not_paid_more_3_weeks : 20897
INFO: Total precedents with tenant_rent_paid_before_hearing : 1810
INFO: Total precedents with tenant_violence : 160
INFO: Total precedents with tenant_withold_rent_without_permission : 81
INFO: Total precedents with violent : 255
INFO: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
INFO: Total precedents with demand_lease_modification : 6
INFO: Total precedents with demand_resiliation : 28425
INFO: Total precedents with landlord_claim_interest_damage : 1674
INFO: Total precedents with landlord_demand_access_rental : 194
INFO: Total precedents with landlord_demand_bank_fee : 71
INFO: Total precedents with landlord_demand_damage : 3852
INFO: Total precedents with landlord_demand_legal_fees : 4716
INFO: Total precedents with landlord_demand_retake_apartment : 417
INFO: Total precedents with landlord_demand_utility_fee : 92
INFO: Total precedents with landlord_fix_rent : 2619
INFO: Total precedents with landlord_lease_termination : 25960
INFO: Total precedents with landlord_money_cover_rent : 16344
INFO: Total precedents with paid_judicial_fees : 3841
INFO: Total precedents with tenant_claims_harassment : 55
INFO: Total precedents with tenant_cover_rent : 25667
INFO: Total precedents with tenant_demands_decision_retraction : 1242
INFO: Total precedents with tenant_demand_indemnity_Code_Civil : 327
INFO: Total precedents with tenant_demand_indemnity_damage : 4750
INFO: Total precedents with tenant_demand_indemnity_judicial_fee : 34
INFO: Total precedents with tenant_demand_interest_damage : 1181
INFO: Total precedents with tenant_demands_money : 2837
INFO: Total precedents with tenant_demand_rent_decrease : 1189
INFO: Total precedents with tenant_respect_of_contract : 78
INFO: Total precedents with tenant_eviction : 23940
INFO: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
INFO: Total precedents with additional_indemnity_date : 0
INFO: Total precedents with additional_indemnity_money : 12452
INFO: Total precedents with declares_housing_inhabitable : 16
INFO: Total precedents with declares_resiliation_is_correct : 5526
INFO: Total precedents with orders_expulsion : 19417
INFO: Total precedents with orders_immediate_execution : 13376
INFO: Total precedents with orders_resiliation : 20029
INFO: Total precedents with orders_tenant_pay_first_of_month : 289
INFO: Total precedents with rejects_landlord_demand : 1720
INFO: Total precedents with rejects_tenant_demand : 1616
INFO: Total precedents with tenant_ordered_to_pay_landlord : 18062
INFO: Total precedents with tenant_ordered_to_pay_landlord_legal_fees : 15120
INFO: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
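To see how unevenly the labels are distributed, the counts in this log can be turned into shares of the 44970 parsed precedents. The sketch below is only an illustration: the regular expressions and the `label_frequencies` helper are assumptions made for this example and are not part of the project's code.

```python
import re

# Matches entries such as "INFO: Total precedents with lease : 26289".
# \s* around the colon tolerates entries that were wrapped across lines.
ENTRY = re.compile(r"Total precedents with\s+(\w+)\s*:\s*(\d+)")
TOTAL = re.compile(r"Total precedents parsed:\s*(\d+)")


def label_frequencies(log_text):
    """Return {label: (count, share of all parsed precedents)}."""
    total = int(TOTAL.search(log_text).group(1))
    return {
        label: (int(count), int(count) / total)
        for label, count in ENTRY.findall(log_text)
    }


# A few entries copied from the log above.
sample = (
    "INFO: Total precedents parsed: 44970 "
    "INFO: Total precedents with lease : 26289 "
    "INFO: Total precedents with declares_housing_inhabitable : 16 "
    "INFO: Total precedents with tenant_ordered_to_pay_landlord : 18062"
)

freqs = label_frequencies(sample)
# Print the rarest labels first.
for label, (count, share) in sorted(freqs.items(), key=lambda kv: kv[1][0]):
    print(f"{label:35s} {count:6d} {share:7.2%}")
```

Labels with only a handful of precedents, such as `declares_housing_inhabitable`, leave a classifier very few positive examples to learn from.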

Samuel-Campbell commented 6 years ago

Classifier training using the entire dataset

INFO: Column: additional_indemnity_money
INFO: Test accuracy: 100.0%
INFO: Precision: [ 1.]
INFO: Recall: [ 1.]
INFO: F1: [ 1.]

INFO: Column: declares_housing_inhabitable
INFO: Test accuracy: 99.95552590615966%
INFO: Precision: [ 0.99955526 0. ]
INFO: Recall: [ 1. 0.]
INFO: F1: [ 0.99977758 0. ]

INFO: Column: declares_resiliation_is_correct
INFO: Test accuracy: 92.38381142984211%
INFO: Precision: [ 0.96577801 0.64932886]
INFO: Recall: [ 0.94744122 0.74351585]
INFO: F1: [ 0.95652174 0.6932378 ]

INFO: Column: orders_expulsion
INFO: Test accuracy: 96.708917055815%
INFO: Precision: [ 0.95827039 0.97971877]
INFO: Recall: [ 0.98543689 0.9425078 ]
INFO: F1: [ 0.97166379 0.96075312]

INFO: Column: orders_immediate_execution
INFO: Test accuracy: 88.92595063375583%
INFO: Precision: [ 0.9480934 0.77408747]
INFO: Recall: [ 0.89148634 0.88396545]
INFO: F1: [ 0.91891892 0.82538569]

INFO: Column: orders_resiliation
INFO: Test accuracy: 96.74227262619524%
INFO: Precision: [ 0.95739348 0.98108747]
INFO: Recall: [ 0.98570861 0.94413549]
INFO: F1: [ 0.97134474 0.96225686]

INFO: Column: orders_tenant_pay_first_of_month
INFO: Test accuracy: 99.36624416277519%
INFO: Precision: [ 0.99366244 0. ]
INFO: Recall: [ 1. 0.]
INFO: F1: [ 0.99682115 0. ]

INFO: Column: rejects_landlord_demand
INFO: Test accuracy: 96.93128752501667%
INFO: Precision: [ 0.97149966 0.84210526]
INFO: Recall: [ 0.99721384 0.33684211]
INFO: F1: [ 0.98418882 0.48120301]

INFO: Column: rejects_tenant_demand
INFO: Test accuracy: 97.30931732265955%
INFO: Precision: [ 0.98019577 0.67307692]
INFO: Recall: [ 0.9921659 0.44585987]
INFO: F1: [ 0.98614451 0.53639847]

INFO: Column: tenant_ordered_to_pay_landlord
INFO: Test accuracy: 100.0%
INFO: Precision: [ 1.]
INFO: Recall: [ 1.]
INFO: F1: [ 1.]
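Two of the columns above (`declares_housing_inhabitable` and `orders_tenant_pay_first_of_month`) pair a near-perfect accuracy with a precision, recall, and F1 of 0 for the second class. A minimal sketch of how that can happen with a rare outcome is shown below. The split of 8990 negatives and 4 positives is an assumption chosen because it is consistent with the reported accuracy; the actual test split is not given in this thread.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall and F1 for one class; 0.0 when a ratio is undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Assumed test split: 8994 precedents, only 4 with the rare outcome.
y_true = [0] * 8990 + [1] * 4
y_pred = [0] * 8994  # a classifier that never predicts the rare outcome

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
class_0 = per_class_metrics(y_true, y_pred, 0)
class_1 = per_class_metrics(y_true, y_pred, 1)

print(f"accuracy: {accuracy:.6%}")
print("class 0 (precision, recall, F1):", class_0)
print("class 1 (precision, recall, F1):", class_1)
```

For such columns, accuracy alone says little; the per-class recall and F1 are the numbers to watch.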

dodoels commented 6 years ago

@Samuel-Campbell update scope of work with supervised training work

Samuel-Campbell commented 6 years ago

GridSearchCV results

It appears that increasing C and epsilon keeps yielding better results, although the improvement levels off toward an asymptote, which still needs to be plotted. So far, by trial and error, the values below have performed best. Tested on 1000 data points.

INFO: Column: additional_indemnity_date INFO: {'C': 9, 'kernel': 'linear', 'epsilon': 1.0}

INFO: Column: additional_indemnity_money INFO: {'C': 9, 'kernel': 'linear', 'epsilon': 1.0}

INFO: Column: tenant_ordered_to_pay_landlord INFO: {'C': 9, 'kernel': 'linear', 'epsilon': 1.0}

INFO: Column: tenant_ordered_to_pay_landlord_legal_fees INFO: {'C': 9, 'kernel': 'poly', 'epsilon': 1.0}
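For readers unfamiliar with grid search, the idea behind the results above is to score every combination of candidate parameters and keep the best one. The sketch below is a plain-Python illustration of that idea only. It is not scikit-learn's `GridSearchCV`, it does no cross-validation, and `toy_score` and the candidate values are invented stand-ins rather than anything measured on the precedent data.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Score every combination in param_grid and return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Stand-in for a validation score; NOT derived from the real data.

    It rises with C but flattens out, mimicking a plateau.
    """
    bonus = 0.05 if params["kernel"] == "linear" else 0.0
    saturation = params["C"] / (params["C"] + 1.0)
    return saturation + bonus - 0.01 * abs(params["epsilon"] - 1.0)


# Hypothetical candidate values; the grid actually searched is not listed in this thread.
param_grid = {
    "C": [1, 3, 5, 7, 9],
    "kernel": ["linear", "poly", "rbf"],
    "epsilon": [0.1, 0.5, 1.0],
}

best_params, best_score = grid_search(param_grid, toy_score)
print(best_params, round(best_score, 3))
```

When the winning value sits on the edge of the grid, as the largest C does in this toy grid, it is usually worth widening the range before trusting the result.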

Samuel-Campbell commented 6 years ago

INFO: Regression Results:

INFO: Column: additional_indemnity_date
INFO: R2: -0.9108530969895694
INFO: Explained Variance: 8.467037558557156e-05

INFO: Column: additional_indemnity_money
INFO: R2: 0.06058693097602019
INFO: Explained Variance: 0.11248639396117954

INFO: Column: tenant_ordered_to_pay_landlord
INFO: R2: 0.05397737986027207
INFO: Explained Variance: 0.11476517798523378

INFO: Column: tenant_ordered_to_pay_landlord_legal_fees
INFO: R2: -0.03279217687514646
INFO: Explained Variance: -0.02755565113675007
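R2 and explained variance differ only in how they treat a constant offset in the predictions: R2 penalises it, explained variance ignores it. The sketch below implements both from their standard definitions on made-up numbers, to show one way a result like the one for `additional_indemnity_date` (clearly negative R2, explained variance near 0) can arise. Whether that is what happened here is an assumption; the data points are purely illustrative.

```python
def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def r2_score(y_true, y_pred):
    """1 - SS_res / SS_tot; penalises any systematic offset in the predictions."""
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def explained_variance(y_true, y_pred):
    """1 - Var(residuals) / Var(y_true); ignores a constant offset."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - variance(residuals) / variance(y_true)


# Made-up targets and a model that always predicts the same, too-low value.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [50.0] * 4

r2 = r2_score(y_true, y_pred)
ev = explained_variance(y_true, y_pred)
print("R2:", r2)
print("Explained variance:", ev)
```

A negative R2 means the predictions are worse than always predicting the mean of the targets.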

Samuel-Campbell commented 6 years ago

WEIGHTS

https://docs.google.com/spreadsheets/d/1JesTQv95ULaNJUhQm4nbClg7al0CUOd48JwsIkfL1N0/edit#gid=1223071576

arekmano commented 6 years ago

Exploring the use of an MLP

Value reported: Mean Squared Error. The values below are negative, which suggests they are negated MSE scores, so values closer to zero are better.

Standardized: -51.57 (5.92) MSE, model: intermed_model, epochs: 50
Standardized: -27.62 (7.40) MSE, model: adv_model, epochs: 50
Standardized: -22.65 (3.44) MSE, model: vadv_model, epochs: 50
Standardized: -26.38 (3.94) MSE, model: intermed_model, epochs: 100
Standardized: -20.34 (2.31) MSE, model: adv_model, epochs: 100
Standardized: -19.19 (2.00) MSE, model: vadv_model, epochs: 100

arekmano commented 6 years ago

Possible refinements to improve the results:

  1. Create a binary classifier to determine whether or not the plaintiff would "succeed" or "fail" in obtaining money from the defendant.
  2. If "success", then a prediction would be obtained from the regressor.

In essence, I propose removing from the regressor's dataset all precedents in which the plaintiff did not obtain any money.
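The proposal above can be sketched as a two-stage predictor: a classifier decides whether any money is awarded, and the regressor is consulted only for predicted wins. Everything named below (`predict_amount`, `regressor_training_rows`, the toy models, and the field names) is hypothetical and stands in for the project's trained models and schema.

```python
def predict_amount(facts, wins_money, estimate_amount):
    """Two-stage prediction: classify first, regress only on predicted wins.

    `wins_money` and `estimate_amount` are hypothetical callables standing in
    for a trained binary classifier and a trained regressor.
    """
    if not wins_money(facts):
        return 0.0
    return max(0.0, estimate_amount(facts))


def regressor_training_rows(precedents):
    """Keep only precedents where money was actually awarded."""
    return [p for p in precedents if p["amount_awarded"] > 0]


# Toy stand-ins; the field names are illustrative, not the project's schema.
def toy_classifier(facts):
    return facts["tenant_owes_rent"]


def toy_regressor(facts):
    return facts["monthly_rent"] * facts["months_unpaid"]


precedents = [
    {"amount_awarded": 1500.0},
    {"amount_awarded": 0.0},
    {"amount_awarded": 820.0},
]
training_rows = regressor_training_rows(precedents)

win_case = {"tenant_owes_rent": True, "monthly_rent": 700.0, "months_unpaid": 2}
lose_case = {"tenant_owes_rent": False, "monthly_rent": 700.0, "months_unpaid": 2}

print(len(training_rows))
print(predict_amount(win_case, toy_classifier, toy_regressor))
print(predict_amount(lose_case, toy_classifier, toy_regressor))
```

One consequence of this design is that the regressor never sees zero-dollar outcomes, so its overall accuracy depends on how well the classifier gates it.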

arekmano commented 6 years ago

Using a basic SVM to classify Win/No Win

Size of dataset: 15227
Sample size: 15227
Train size: 12181
Test size: 3046

LINEAR
Test accuracy: 62.17990807616546%
Precision: [ 0.61008403 0.624643 ]
Recall: [ 0.28293063 0.86840613]
F1: [ 0.38658147 0.72662553]

BEST PARAMS {'C': 5, 'kernel': 'rbf'}

RBF, C=5
Test accuracy: 62.77084701247538%
Precision: [ 0.68350168 0.6141925 ]
Recall: [ 0.30029586 0.88902007]
F1: [ 0.41726619 0.72648336]

RBF, C=7
Test accuracy: 63.591595535128036%
Precision: [ 0.65693431 0.62981787]
Recall: [ 0.33987915 0.86353078]
F1: [ 0.44798407 0.72838599]
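To put these accuracies in context, the class sizes in the test set can be recovered from the reported accuracy and per-class recall, which gives the accuracy of a baseline that always predicts the larger class. The sketch below does that arithmetic for the LINEAR figures. The `class_supports` helper is an assumption made for this example, and the result is only as exact as the rounded metrics allow.

```python
def class_supports(accuracy, recall_0, recall_1, n_test):
    """Solve recall_0*n0 + recall_1*n1 = accuracy*n_test with n0 + n1 = n_test."""
    correct = accuracy * n_test
    n1 = (correct - recall_0 * n_test) / (recall_1 - recall_0)
    return n_test - n1, n1


# Figures reported above for the LINEAR kernel.
n0, n1 = class_supports(0.6217990807616546, 0.28293063, 0.86840613, 3046)
n0, n1 = round(n0), round(n1)

# Accuracy of always predicting the larger class.
baseline = max(n0, n1) / (n0 + n1)
print(f"class 0: {n0}, class 1: {n1}, majority baseline: {baseline:.2%}")
```

On these figures the test set works out to about 1283 and 1763 precedents per class, so always predicting the larger class would already score close to 58%, only a few points below the SVM results.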

naregeff commented 6 years ago

It currently predicts a value, which demonstrates functionality. Accuracy will require improvement.

arekmano commented 6 years ago

Some Experiments with TenantPaysLandlordRegressor

Configuration

Existing Network

EPOCH: 100

R2: 0.71 Explained Variance: 0.71 Mean Absolute Error: 542.50 Mean Squared Error: 1496880.59


EPOCH: 200

R2: 0.72 Explained Variance: 0.72 Mean Absolute Error: 507.29 Mean Squared Error: 1407201.10


EPOCH: 300

R2: 0.73 Explained Variance: 0.73 Mean Absolute Error: 485.98 Mean Squared Error: 1388681.49


EPOCH: 400

R2: 0.73 Explained Variance: 0.73 Mean Absolute Error: 486.90 Mean Squared Error: 1357591.09


EPOCH: 500

R2: 0.74 Explained Variance: 0.74 Mean Absolute Error: 495.47 Mean Squared Error: 1343432.35


EPOCH: 600

R2: 0.74 Explained Variance: 0.74 Mean Absolute Error: 475.28 Mean Squared Error: 1308561.09


EPOCH: 700

R2: 0.74 Explained Variance: 0.74 Mean Absolute Error: 491.59 Mean Squared Error: 1326475.56


EPOCH: 800

R2: 0.75 Explained Variance: 0.75 Mean Absolute Error: 477.77 Mean Squared Error: 1291887.96


Configuration

model.add(Dense(64, input_dim=self.input_dimensions, kernel_initializer='normal', activation='relu'))
model.add(Dense(64, kernel_initializer='normal', activation='relu'))
model.add(Dense(32, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal'))

EPOCH: 100


R2: 0.74 Explained Variance: 0.74 Mean Absolute Error: 498.30 Mean Squared Error: 1332743.50


EPOCH: 200


R2: 0.77 Explained Variance: 0.77 Mean Absolute Error: 435.51 Mean Squared Error: 1152454.46


EPOCH: 300


R2: 0.79 Explained Variance: 0.79 Mean Absolute Error: 432.16 Mean Squared Error: 1063420.42


EPOCH: 400


R2: 0.81 Explained Variance: 0.81 Mean Absolute Error: 407.97 Mean Squared Error: 977754.54


EPOCH: 500


R2: 0.82 Explained Variance: 0.82 Mean Absolute Error: 390.73 Mean Squared Error: 926247.98


EPOCH: 600


R2: 0.83 Explained Variance: 0.83 Mean Absolute Error: 378.06 Mean Squared Error: 882072.70


EPOCH: 700


R2: 0.83 Explained Variance: 0.83 Mean Absolute Error: 356.07 Mean Squared Error: 879462.85


EPOCH: 800


R2: 0.83 Explained Variance: 0.83 Mean Absolute Error: 365.47 Mean Squared Error: 863401.15
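Because MSE is in squared units, it is easier to compare with the mean absolute error after taking the square root. The sketch below does this for the epoch-800 figures of both configurations above. The run labels are made up for this example, and reading the units as dollars is an assumption based on the regressor's name.

```python
import math

# (MAE, MSE) at epoch 800, copied from the two result lists above.
runs = {
    "existing network": (477.77, 1291887.96),
    "64-64-32-1 network": (365.47, 863401.15),
}

rmse_by_run = {}
for name, (mae, mse) in runs.items():
    rmse = math.sqrt(mse)
    rmse_by_run[name] = rmse
    print(f"{name:20s} MAE={mae:8.2f} RMSE={rmse:8.2f} RMSE/MAE={rmse / mae:.2f}")
```

In both cases the RMSE is more than twice the MAE, which would be consistent with a minority of precedents carrying very large errors, although the per-precedent errors are not shown here to confirm it.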