e-mission / e-mission-eval-private-data

Evaluate the e-mission platform along various metrics
BSD 3-Clause "New" or "Revised" License

Update clustering.py #37

Closed humbleOldSage closed 11 months ago

humbleOldSage commented 1 year ago

Changes to clustering.py that shift its dependency from hlu109's tour_model_extended to the main branch's trip_model. For this to work, the type of data passed to the fit function still needs to change; this is marked with a TODO. Explained in detail at https://github.com/e-mission/e-mission-eval-private-data/issues/35#issuecomment-1674290156

humbleOldSage commented 1 year ago

Have you run the code?

Yes

Do you get the same graphs as the paper?

Yet to confirm

Please indicate "testing done".

Ongoing. My plan is to compare the labels generated by the custom branch with those generated by the master branch. If they match, that verifies that the two branches behave the same way.

Is there any other way I can test this?
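A minimal sketch of the label comparison described above, assuming each branch yields one cluster label per trip in the same trip order. The function name `same_partition` is hypothetical and not part of either branch. Because cluster IDs are arbitrary, the check treats two labelings as equal when they group the trips identically, even if the ID numbers differ.

```python
from typing import Hashable, Sequence


def same_partition(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> bool:
    """Return True if both label sequences group the trips identically.

    Cluster IDs are arbitrary, so [0, 0, 1] and [5, 5, 2] count as equal.
    Hypothetical helper for illustration; not taken from the repository.
    """
    if len(labels_a) != len(labels_b):
        return False
    a_to_b = {}  # label in run A -> the label it must correspond to in run B
    b_to_a = {}  # reverse mapping, so the correspondence stays one-to-one
    for a, b in zip(labels_a, labels_b):
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True
```

For example, `same_partition(custom_labels, master_labels)` would return `False` as soon as one branch merges or splits a cluster that the other does not, where `custom_labels` and `master_labels` are placeholder names for the two label lists.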

humbleOldSage commented 1 year ago

Do you get the same graphs as the paper?

They differ. Let me check why this is happening.

humbleOldSage commented 1 year ago

Tested. This is running with no errors. Can confirm this generates the same results.

humbleOldSage commented 1 year ago
  • is this the only notebook that is affected by the change? I know that we have a notebook which generates the performance (accuracy/F-score) of various algorithms. I would expect that it would also need to be changed...

Almost all the other notebooks depend on this module.

humbleOldSage commented 1 year ago
  • I would like to see more information in the PR issue that it works (screenshots, information about the model indicating that it works)

Screenshot from the latest run, showing no errors.

screencapture-localhost-8888-notebooks-clustering-examples-ipynb-2023-08-20-16_14_14

humbleOldSage commented 1 year ago

Left is current result. Right is from research paper. Suburban 50m.

Suburban 100m

Suburban 150m

humbleOldSage commented 1 year ago

Left is current result. Right is from research paper. College 50m.

College 100m

College 150m

humbleOldSage commented 1 year ago

Four notebooks are now working correctly: generate_figs_for_poster.ipynb, SVM_decision_boundaries.ipynb, cluster_performance.ipynb, and clustering_example.ipynb.

humbleOldSage commented 1 year ago

These are the results I got from running get_performance_for_poster.ipynb:

Screen Shot 2023-08-30 at 3 36 45 PM

humbleOldSage commented 1 year ago

PARTIALLY TESTED: I have tested all the notebooks except the one that takes 2 days to run. It has been running for a day now. As soon as that's done, I'll update here.

shankari commented 1 year ago

@humbleOldSage The comments here have not been addressed; this is still a draft in the "Changes requested" state. https://github.com/e-mission/e-mission-eval-private-data/issues/35 is not complete.

humbleOldSage commented 11 months ago

Since this is only partially tested, I'll keep the PR as a draft. As soon as I have completed the final testing, I'll mark it as ready to merge.

humbleOldSage commented 11 months ago

Not tested. Needs Testing.

humbleOldSage commented 11 months ago

generate_figs_for_poster.ipynb:

Plot after latest testing: Screenshot 2023-11-16 at 12 01 08 PM

Snapshot from the research paper: Screenshot 2023-11-16 at 3 40 13 PM

Plot after latest testing: Screenshot 2023-11-16 at 12 01 40 PM

Snapshot from the research paper: Screenshot 2023-11-16 at 3 43 06 PM

humbleOldSage commented 11 months ago

generate_figs_for_poster.ipynb:

On the left are plots from the current testing; on the right are images from runs of the notebook with @hlu109's custom branch:

naive fixed-width clustering from the first user's data

150m

50m

100m
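For readers unfamiliar with the term, here is a minimal sketch of one common reading of naive fixed-width clustering, shown to illustrate why the 50 m, 100 m, and 150 m plots differ. It is an assumption for illustration, not the implementation in clustering.py or trip_model; the names `haversine_m` and `fixed_width_clusters` are hypothetical. Each point joins the first cluster whose seed point lies within the radius, and otherwise starts a new cluster.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def fixed_width_clusters(points, radius_m):
    """Greedy fixed-radius clustering (illustrative assumption, not the repo's code).

    Each point is assigned to the first existing cluster whose seed is within
    radius_m; if none qualifies, the point becomes the seed of a new cluster.
    Returns one integer label per input point.
    """
    seeds = []
    labels = []
    for p in points:
        for idx, seed in enumerate(seeds):
            if haversine_m(p, seed) <= radius_m:
                labels.append(idx)
                break
        else:
            # No seed was close enough, so start a new cluster at this point.
            seeds.append(p)
            labels.append(len(seeds) - 1)
    return labels
```

With three made-up points spaced roughly 56 m and 133 m north of the first, a 50 m radius keeps all three apart, a 100 m radius merges the first two, and a 150 m radius merges all three. The result depends on input order, which is one reason this approach is called naive.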

DBSCAN without SVM: home cluster with a blue cluster to the south that was merged in

DBSCAN + SVM: home cluster and blue cluster to the south have been separated

humbleOldSage commented 11 months ago

clustering_example.ipynb:

Left is current test result. Right is from research paper. Suburban 50m.

Suburban 100m

Suburban 150m

Left is current result. Right is from research paper. College 50m.

College 100m

College 150m

humbleOldSage commented 11 months ago

SVM_decision_boundary.ipynb:

On the left are plots from the current test; on the right are plots from old runs:

humbleOldSage commented 11 months ago

get_cluster_performance.ipynb:

For each pair, the top image is the result of the current test and the bottom image is the result from older runs:

output2

Screen Shot 2023-08-22 at 1 05 03 AM

output3

Screen Shot 2023-08-22 at 1 05 34 AM

output4

Screen Shot 2023-08-22 at 1 05 44 AM

output5

Screen Shot 2023-08-22 at 1 05 52 AM

humbleOldSage commented 11 months ago

All model results:

Screenshot 2023-11-16 at 5 53 00 PM

shankari commented 11 months ago

@humbleOldSage two more comments.

shankari commented 11 months ago

Squash-merging since this is 21 commits for some fairly simple changes. @humbleOldSage please account for this while making any future changes.