BALaka-18 / rake_new2

A Python library that enables smooth keyword extraction from any text using the RAKE (Rapid Automatic Keyword Extraction) algorithm.
MIT License

Test the current algorithm of rake_new2 to look for edge cases #9

Open BALaka-18 opened 3 years ago

BALaka-18 commented 3 years ago

Description

No algorithm can escape edge cases. Your task is to check and test for probable edge cases where you think the algorithm might fail, by trial and error. Test the library on as many texts as you can.

Read: How to use rake_new2

For example, the previous version of this algorithm couldn't handle HTML tags in text. This was resolved in the current version.
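To make the HTML example concrete, here is a minimal sketch of one way to strip tags from text before keyword extraction, using only the Python standard library. It is an illustration of the edge case, not the method rake_new2 itself uses; the names `_TextOnly` and `strip_html` are hypothetical.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects only the text nodes of an HTML fragment (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for every run of text that sits between tags.
        self.parts.append(data)


def strip_html(text):
    """Return `text` with tags removed and runs of whitespace collapsed.

    Assumption: text nodes are joined with a space, so "key<b>word</b>"
    becomes "key word". That may or may not be what an extractor wants.
    """
    parser = _TextOnly()
    parser.feed(text)
    parser.close()
    return " ".join(" ".join(parser.parts).split())
```

Comparing the keywords extracted from the raw input and from `strip_html(raw)` is one way to check whether markup still leaks into the results.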

NOTE: This may be a multi-assignee issue.

Folder Structure, Function details

Create a folder named test_cases in the root directory. The folder must contain a .txt file that lists all the edge cases you found, with each edge case on a separate line.

Structure: test_cases/edge_cases_file.txt
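A file in this layout could be consumed roughly as sketched below. This is only a sketch under assumptions: `load_edge_cases` and `run_edge_cases` are hypothetical helper names, and `extract` stands for whatever callable wraps the keyword extraction call, since the exact rake_new2 API is not shown in this issue.

```python
from pathlib import Path


def load_edge_cases(path="test_cases/edge_cases_file.txt"):
    """Return the non-blank lines of the edge-case file, one case per line."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def run_edge_cases(cases, extract):
    """Call `extract` on every case and collect (case, exception) pairs.

    `extract` is an assumed callable taking one string; plug in a thin
    wrapper around the library call here.
    """
    failures = []
    for case in cases:
        try:
            extract(case)
        except Exception as exc:  # deliberately broad: any crash is a finding
            failures.append((case, exc))
    return failures
```

This only catches inputs that raise an exception. Cases where the extractor returns wrong or surprising keywords still need to be judged by hand. A one-case-per-line file also cannot hold the empty string or multi-line text, so those inputs would have to be tested separately.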

Acceptance Criteria

Definition of Done

Time Estimation

Recurring

etnnth commented 3 years ago

I'm going to start working on that if it's ok.

BALaka-18 commented 3 years ago

> I'm going to start working on that if it's ok.

Assigning it to you @etnnth. Make sure the PR is created after 1st Oct.

sudhanshutiwari264 commented 3 years ago

Can I also contribute to this?

BALaka-18 commented 3 years ago

> Can I also contribute to this?

Yes, sure. Assigning you too. You can collaborate and discuss with @etnnth.

sudhanshutiwari264 commented 3 years ago

@etnnth can we discuss it?

etnnth commented 3 years ago

@sudhanshutiwari264 yes, sure! How do you want to proceed? For now I have just taken a look at the code and run it a few times on different inputs to get a better idea of what it is doing.

sudhanshutiwari264 commented 3 years ago

@etnnth Can we talk on Discord? It will be easier for us to discuss there!

etnnth commented 3 years ago

Yes, Discord is fine: https://discord.com/invite/FFFHHy

BALaka-18 commented 3 years ago

Are you still working on this? If not, I'll assign it to someone else, because my project is going in for another open source contest. Please reply at the earliest.

etnnth commented 3 years ago

Sorry for the late reply. I'm not working on this for now. I got blocked on it because I didn't find a reliable way to decide whether something was an edge case or not. What I did start, however, is writing simple test cases. If this is something you are interested in, I can add a few more and make a pull request.

BALaka-18 commented 3 years ago

@etnnth no problem, I'll unassign you. Thanks for your time :)

sudhanshutiwari264 commented 3 years ago

@BALaka-18 Sorry, I have been busy with exams and internships, so I'm unassigning myself from this. Hope you understand!

BALaka-18 commented 3 years ago

> @BALaka-18 Sorry, I have been busy with exams and internships, so I'm unassigning myself from this. Hope you understand!

Yes, absolutely, no problem. Thanks for your time :)

koolgax99 commented 3 years ago

@BALaka-18 can you please assign this to me?

BALaka-18 commented 3 years ago

> @BALaka-18 can you please assign this to me?

Assigned.