fani-lab / Osprey

Online Predatory Conversation Detection

Random Forest Baseline #15

Open hosseinfani opened 1 year ago

hosseinfani commented 1 year ago

This issue tracks adding a random forest classifier that labels each message in a conversation as predatory or normal.

We need to implement train, test, eval, and main so that each stage depends only on the saved results of the previous one. Sample code can be found in:

For coding the main of each baseline: https://github.com/fani-lab/OpeNTF/blob/148c1c2defe1176563f162ad159b2ffe0af15ecc/src/mdl/ntf.py#L60

For coding the eval function of each baseline: https://github.com/fani-lab/OpeNTF/blob/148c1c2defe1176563f162ad159b2ffe0af15ecc/src/mdl/ntf.py#L18

For storing predictions and evaluation results in a pandas DataFrame and averaging them: https://github.com/fani-lab/OpeNTF/blob/main/src/eval/metric.py

For file management: https://github.com/fani-lab/learning_to_refine_query/tree/main/output/l2cr
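As a rough illustration of the four stages described above, here is a minimal sketch in which each stage reads only what the previous stage saved to disk. Everything in it is an assumption made for illustration, not code from this repository or from OpeNTF: the function names `run_train`, `run_test`, `run_eval`, the file names `model.pkl`, `pred.json`, `eval.json`, and the `MajorityClassifier` placeholder. The placeholder only mimics a scikit-learn style `fit`/`predict` interface so the sketch runs with the standard library; a random forest exposing the same interface would be passed in its place.

```python
import json
import pickle
from collections import Counter
from pathlib import Path


class MajorityClassifier:
    """Hypothetical placeholder with a scikit-learn style fit/predict interface.

    Assumption: a real run would pass a random forest model instead.
    """

    def fit(self, X, y):
        # Remember the most frequent training label.
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        # Predict the remembered label for every row.
        return [self.label_ for _ in X]


def run_train(model, X, y, output):
    """Fit the model and save it; skip fitting if a saved model already exists."""
    path = Path(output) / "model.pkl"
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        with open(path, "wb") as f:
            pickle.dump(model.fit(X, y), f)
    return path


def run_test(X, output):
    """Load the saved model, predict on X, and save the predictions."""
    with open(Path(output) / "model.pkl", "rb") as f:
        model = pickle.load(f)
    preds = list(model.predict(X))
    path = Path(output) / "pred.json"
    path.write_text(json.dumps(preds))
    return path


def run_eval(y_true, output):
    """Load the saved predictions, compute accuracy, and save the result."""
    preds = json.loads((Path(output) / "pred.json").read_text())
    accuracy = sum(p == t for p, t in zip(preds, y_true)) / len(y_true)
    (Path(output) / "eval.json").write_text(json.dumps({"accuracy": accuracy}))
    return accuracy


def main(model, train_data, test_data, output, cmd=("train", "test", "eval")):
    """Run the selected stages; each relies only on files saved by earlier ones."""
    (X_train, y_train), (X_test, y_test) = train_data, test_data
    if "train" in cmd:
        run_train(model, X_train, y_train, output)
    if "test" in cmd:
        run_test(X_test, output)
    if "eval" in cmd:
        return run_eval(y_test, output)
    return None
```

Because every stage goes through the output folder, a later stage can be rerun on its own, for example `main(model, train_data, test_data, output, cmd=("eval",))`, without retraining.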

EhsanSl commented 1 year ago

That's awesome, I'm going to go through them. I appreciate it!

EhsanSl commented 1 year ago

Hi Dr. Hossein, the reason I'm taking your time is that I tried to run the project once on my computer and some packages needed to be installed, so I added a list of them to README.md in case someone does not have them already. The last issue I had was about a file that is present but cannot be accessed. Would you please look into it?

issue1

issue2

Thanks in advance.

hosseinfani commented 1 year ago

@EhsanSl Check the working directory when you run the code. It should be ./src

I believe you can solve it yourself.
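A quick way to confirm this before running the code is sketched below. The `src` folder name comes from the comment above; the helper name `in_src_dir` is hypothetical and not part of the project.

```python
from pathlib import Path


def in_src_dir(cwd=None):
    """Return True if the given (or current) working directory is named 'src'."""
    path = Path(cwd) if cwd is not None else Path.cwd()
    return path.resolve().name == "src"


if __name__ == "__main__":
    if not in_src_dir():
        # Relative paths in the code are assumed to be resolved from ./src.
        print(f"Working directory is {Path.cwd()}; expected ./src")
```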

EhsanSl commented 1 year ago

That's a good point! Thanks!

EhsanSl commented 1 year ago

Hi Dr. Hossein, feel free to ignore all those commits for README.md; I finally learned to use requirements.txt to install the packages. Sorry about that!

EhsanSl commented 1 year ago

Hi Dr. Hossein, I hope you are having a great day! After I finally ran the code, went through the files, and tried to familiarize myself with the project in general, I realized I need to go through part of a LinkedIn course again (applied-machine-learning-algorithms) to make sure I understand it better. I also found a handout on the Codecademy website for practicing object-oriented Python, and I hope it helps me get started on coding faster. If you don't mind, I would like to go through these first and educate myself a bit more. I would really appreciate it if you could share any newbie-friendly resources as well.

hosseinfani commented 1 year ago

@EhsanSl Thanks for the update. When do you estimate you can start coding the first classifier? I thought you already had some experience training ML models.

EhsanSl commented 1 year ago

Yes, I did some model creation, but that code was written by me and one of my classmates, so it was much simpler :( Can I give a more accurate date in a few days? I'd like to keep my promise, that's why!

EhsanSl commented 1 year ago

Hi Dr. Hossein, I hope everything is going well. I reviewed the course material, went through the OOP with Python handout, and tried to better understand the relations between our classes. I tried some coding as well; since I did not want to go too deep just yet, I implemented the RandomForest inside baseline for now. I tried a few things here and there, but judging from the errors I get, I seem to have problems passing all the features and labels (targets) to the baseline. I was wondering if you would take a look at what I have tried so far and maybe explain what I am missing.