ProjectSidewalk / sidewalk-quality-analysis

An analysis of Project Sidewalk user quality based on interaction logs

Python script to run classifier for core #53

Open nch0w opened 5 years ago

nch0w commented 5 years ago

I added classifier.py, a first version of a script that can run the user quality regressor on Core.

To run it:

import classifier

# Fit the user quality regressor on the training CSVs.
clf = classifier.UserQualityRegressor()
clf.fit('ml-label-correctness-one-mission.csv', 'sidewalk-seattle-label_point.csv', 'ml-users.csv')
# Predict for a single user; the first argument is that user's interaction-log CSV.
clf.predict_one_user('user-interaction-logs.csv', 'ml-label-correctness-one-mission.csv', 'sidewalk-seattle-label_point.csv')

Prediction doesn't work yet. Maybe Tyler can work on fixing it.

daotyl000 commented 5 years ago

'user-interaction-logs.csv' is a placeholder for the name of the file that holds all of the interaction data for the user you want to make a prediction for.

I also added another argument to the predict_one_user function: the file name of a CSV containing the number of panos viewed by each user. It is used for the features that are normalized by panos viewed.
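
As a rough illustration of what normalizing a feature by panos viewed could look like, here is a minimal sketch using only the standard library. The column names (user_id, panos_viewed), the helper names, and the sample data are all hypothetical; this is not the code in classifier.py, and the real CSV layout may differ.

```python
import csv
import io


def load_panos_viewed(csv_file):
    """Read a CSV with (hypothetical) columns user_id, panos_viewed into a dict."""
    reader = csv.DictReader(csv_file)
    return {row["user_id"]: int(row["panos_viewed"]) for row in reader}


def normalize_by_panos(raw_counts, panos_viewed):
    """Divide each user's raw feature count by that user's panos viewed.

    Users with no pano record, or with zero panos viewed, get None so the
    caller can decide how to handle them.
    """
    normalized = {}
    for user_id, count in raw_counts.items():
        panos = panos_viewed.get(user_id, 0)
        normalized[user_id] = count / panos if panos > 0 else None
    return normalized


# Made-up example data standing in for the panos-viewed CSV.
panos_csv = io.StringIO("user_id,panos_viewed\nu1,20\nu2,0\n")
panos = load_panos_viewed(panos_csv)

# Made-up raw feature: number of labels placed per user.
label_counts = {"u1": 50, "u2": 3, "u3": 7}
features = normalize_by_panos(label_counts, panos)
```

Returning None for users without a usable pano count avoids a division by zero and leaves the choice of dropping or imputing those users to the caller.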

I am continuing to work on the script to get prediction working.