seatgeek / fuzzywuzzy

Fuzzy String Matching in Python
http://chairnerd.seatgeek.com/fuzzywuzzy-fuzzy-string-matching-in-python/
GNU General Public License v2.0

partial ratio not behaving properly #193

Open LSYS opened 6 years ago

LSYS commented 6 years ago

I have a transcript (t) and a paragraph (p) taken from this transcript. I want to match a 2nd substring (s) to both p and t.

If I understand partial ratio correctly, it should never return

fuzz.partial_ratio(s, p) > fuzz.partial_ratio(s, t)

since p is simply a substring of t, but this is exactly the situation I have, and I'm confused.

josegonzalez commented 6 years ago

Do you have an exact test case we can run?

LSYS commented 6 years ago

from fuzzywuzzy import fuzz

# s, p and t are defined below
fuzz.partial_ratio(s, t)  # --> 59

fuzz.partial_ratio(s, p)  # --> 82


s = u"""this is a very high percentage and is a departure from the principle of self reliance if these patients children are also low income earners we are asking one disadvantaged group to pay for another """

p = u"""this is a very high percentage and is in fact a departure from the principle of \u201cself reliance\u201d if these patients\u2019 children are also low income earners \u2212 as is often the case \u2212 the government is merely shifting the burden of poverty within the pool of the poor """

t = u"""mr speaker thank you for giving me the opportunity to make this my maiden speech to this house in his address to parliament the president gave a broad outline of this government\u2019s goals for the next five years the government says it wants \u201cevery singaporean worker to hold a skilled well paid job every family to live in an affordable comfortable home every young person to develop himself fully and pursue his dreams every senior citizen to stay active and to live with dignity\u201d these are bold goals which my colleagues and i in the workers\u2019 party will hold the government accountable for over the next five years sir today i would like to focus on three areas that many senior citizens families and workers have pressing concerns about they are healthcare public housing and public transport mr speaker the axiom \u201cit\u2019s better to die than to fall ill in singapore\u201d has been heard time and again i believe at least twice in this debate many singaporeans especially the poor worry greatly about falling ill they are concerned not just about the painful treatment they will have to go through but more often about the high costs involved and the financial burden that may place on their already struggling children in singapore government subsidies make up only a quarter of total health expenditure out of pocket expenses employer benefits and private insurance make up most of the remainder the much vaunted \u201c3ms\u201d of medisave medishield and medifund pay for less than 10 of total healthcare expenses the lion\u2019s share of which comes from medisave which is really patients\u2019 own savings medishield is a self funding insurance scheme which members pay premiums to join these premiums rise as they grow older they also have to fork out large deductibles and co insurance before receiving pay outs and the coverage ends at age 85 the government will say that we have medifund but medifund is subject to extremely stringent means testing and 
disbursements are not exactly generous in 2009 an average of 1 029 was given to less than 24 000 inpatient medifund applicants this represented just 5 of total hospital admissions that year for seniors with no income and little savings the burden of healthcare is shifted to their children in 2005 60 of the elderly had their medical bills paid from their adult children\u2019s medisave accounts this is a very high percentage and is in fact a departure from the principle of \u201cself reliance\u201d if these patients\u2019 children are also low income earners \u2212 as is often the case \u2212 the government is merely shifting the burden of poverty within the pool of the poor basically we are asking one disadvantaged group to pay for another the government seems very reluctant to take on a larger financial burden for caring of our senior citizens instead it hides behind the mantras of self reliance and filial piety to justify its relatively low expenditure on healthcare for the elderly self reliance is good in principle but when a patient has exhausted his own savings and has to rely on his own struggling family members then we as a society are not being fair to both the patient and his family the ministry of health claims to provide universal health coverage to citizens but i believe we are still some way from that the world health organization defines universal health coverage as having a healthcare financing system that provides all people with access to adequate healthcare services without suffering financial hardship paying for them if we are to achieve this goal we need to expand the coverage of medishield and reduce the over reliance on direct payments by patients at the time they need the care to fund this we need to strengthen the current forms of pre payments and risk pooling and provide assistance to those who cannot afford the premiums like housewives and the elderly all these point to a need to perform major surgery on medishield mr speaker for some time 
now our public hospitals have been running at near full capacity with bed occupancy rates often exceeding 90 for tan tock seng hospital and over 85 for national university hospital khoo teck puat hospital which opened just last year was supposed to ease the crunch but it too has been running at almost 85 capacity for the past months the royal college of surgeons in the united kingdom has advised that bed occupancy rates above 82 put patients at an increased risk of infection it was reported in it was reported inthe it was reported inthe it was reported inthestraits times it was reported inthestraits timeson 20 august this year that hospitals in singapore are facing such a severe crunch in beds that some are \u201cborrowing\u201d space from other nearby organisations to house their patients how did we get into such a situation between the year 2000 and 2010 our population has seen an increase of 26 mostly through immigration the number of hospital admissions has seen an increase of 15 in the same period however not only has the number of hospital beds not kept pace with population growth but it has actually decreased during this period in the past decade there had been a 7 drop in the number of public sector hospital beds according to the department of statistics two years ago the then health minister admitted that on hindsight his ministry made a mistake by not building a new hospital two years earlier recently the health minister floated the idea of bringing forward the opening of sengkang hospital currently scheduled for 2020 i support this move but this is still a long time to wait and by that time our population would have increased even more what is left unanswered is why this self proclaimed \u201cfar sighted\u201d government failed in the last 10 years to build our healthcare infrastructure to keep pace with population growth and an ageing population was the government instead fixated on the near sighted goals of boosting economic growth by increasing our 
population mr speaker i would now like to address many singaporeans\u2019 concerns about the public housing situation in singapore in the past 10 years the hdb had grossly under supplied new housing units to the market according to figures from the hdb between 2001 and 2009 an average of just 7 700 new flats was built each year this was far short of the average annual resident household growth figure of 24 280 since 2005 even when the population surged from 2007 onwards because of the liberalisation of our immigration policies the government failed to react by building more flats for our people instead they permitted more cash rich foreigners to purchase almost any types of private property which increased their prices and pulled up the hdb flat prices since the two are linked this combination of low supply and high demand resulted in a severe housing shortage causing a sharp and sustained rise in property prices hdb resale flat prices are now 92 higher than they were 10 years ago this has not only caused much distress for many singaporean families but has also created potential asset bubble which could severely damage singapore\u2019s economy in a downturn the government finally woke from its slumber this year and ramped up the supply of build to order flats to an expected 25 000 this year and another 25 000 next year this is a move in the right direction however bto flats do not solve the immediate housing problem because it takes up to two to three years before the new flat owners get their keys in the meantime many are still without a home of their own despite the bumper launches of bto and sale of balance flats this year we still saw the third quarter hdb retail price index shoot up 3 8 over the previous quarter the cooling measures that the ministry of national development put in place earlier this year do not seem to be having their intended effects on the resale flat market the government has gone some way in reducing the housing problem for first timer 
couples but not for singles divorcees and those who need to downgrade to smaller flats because of financial difficulty we need to find a way to help these people who are caught in between the policies in particular more measures need to be put in place to cool down the resale hdb flat market the hdb market whether direct or resale cannot simply be left to market forces as a provider of this public good the government must step in to ensure that the welfare of its citizens comes first mr speaker please allow me to share some longstanding concerns about public transport in singapore in march this year just before the general election was announced smrt and sbs transit said they would add 590 additional mrt train trips this was expected to ease the squeeze on trains however many regular commuters will testify that the trains now seem even more crowded than ever the recently opened circle line may improve the situation nearer the city but for those commuting from the suburbs like sembawang and simei finding room to board the trains will still be a challenge one key factor that affects the train loads is the waiting time i understand that the current signalling systems on the ageing north south and east west lines allow for maximum train arrival intervals of about two minutes without compromising commuter safety but if the trains really arrive once every two minutes the overcrowding problem would not be so severe unfortunately this is seldom the case outside of the narrow window of about half an hour on weekday mornings and evenings the frequency drops to three to five minutes or even more this results in trains arriving packed with passengers making it impossible for many of those on the platform to board as a daily commuter myself i often have to wait two sometimes three trains to pass by before i can board during the morning and evening rush hours sir if the government is serious about encouraging our people to drive less and use more public transport it must give 
priority to tackling the overcrowding problem on trains the solution lies not only in building more lines but making better use of existing lines by increasing train frequency and maintaining that frequency for longer periods especially during peak periods why can the mrt operators not maintain a train interval of two minutes from let us say to and from to is it because of technical constraints or is it because it would increase their costs and reduce their profits under the current profit maximising model operators are incentivised to cut costs and service levels just to maintain their high margins their duopoly position in the local market reinforces this behaviour it is time for the government to demand that these operators provide a higher level of service to commuters even if it reduces their profit margins mr speaker whether in healthcare public housing or public transport the government has gone too far down the road of pursuing free market efficiency often to the detriment of the elderly and low wage workers at the time when our citizens are exposed to heightened risks in the form of global competition increased economic volatility rising inequality and wage stagnation the government is exposing them to more competition from foreigners our workers are told to be \u201ccheaper better faster\u201d more self reliant and less selective about their jobs this regressive transfer of risks from government to citizens must count as one of the pap government\u2019s biggest policy failures in the last decade the demographics social and economic changes of the 21st century demand a rethink of how much a government should provide for its people and how much we can reasonably ask our citizens to provide for themselves mr speaker we are at the dawn of a new era in the history of our nation the phrase \u201cnew normal\u201d has been used to describe this new political reality now with more workers\u2019 party members in parliament some pundits wonder if we will be a 
constructive or disruptive party in parliament will we help to build our country or be obsessed with tearing down our political opponents i believe this is related to a concern raised by mr speaker we are at the dawn of a new era in the history of our nation the phrase \u201cnew normal\u201d has been used to describe this new political reality now with more workers\u2019 party members in parliament some pundits wonder if we will be a constructive or disruptive party in parliament will we help to build our country or be obsessed with tearing down our political opponents i believe this is related to a concern raised bymr lee yi shyan mr speaker we are at the dawn of a new era in the history of our nation the phrase \u201cnew normal\u201d has been used to describe this new political reality now with more workers\u2019 party members in parliament some pundits wonder if we will be a constructive or disruptive party in parliament will we help to build our country or be obsessed with tearing down our political opponents i believe this is related to a concern raised bymr lee yi shyanjust now i believe our party\u2019s track record in parliament answers these questions having more workers\u2019 party mps does not change our rational and responsible approach to politics we want to be a force for good in this country \u2212 to help uncover solutions not to add to the problems however it takes two hands to clap the responsibility for ensuring fair and constructive debates in and out of this house rests not only on the opposition but also on the government i hope the debates in this house will not just be about winning the argument or scoring political points but leveraging on the arguments and counter arguments to elicit better policy outcomes this will ultimately benefit singaporeans who put us here to serve them"""
septian-putra commented 6 years ago

For that test case, I think the result is reasonable. The partial ratio calculates a similarity score, and because t is longer than p, fuzz.partial_ratio(s, p) > fuzz.partial_ratio(s, t).
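
One way to probe why the length of t could matter is sketched below. It is not fuzzywuzzy's code. It assumes that partial ratio scores only the windows of the longer string that line up with `difflib.SequenceMatcher` matching blocks, and that difflib's documented `autojunk` heuristic (which applies to the second sequence once it has 200 or more items) can change which blocks are found. The names `candidate_window_score` and `exhaustive_window_score` are hypothetical helpers introduced here for illustration only.

```python
from difflib import SequenceMatcher


def _split(s1, s2):
    """Return (shorter, longer) so the shorter string is slid over the longer."""
    return (s1, s2) if len(s1) <= len(s2) else (s2, s1)


def candidate_window_score(s1, s2, autojunk=True):
    """Score only windows aligned to difflib matching blocks (assumed strategy)."""
    shorter, longer = _split(s1, s2)
    # autojunk only has an effect when the second sequence has >= 200 items.
    matcher = SequenceMatcher(None, shorter, longer, autojunk=autojunk)
    best = 0.0
    for a_start, b_start, _size in matcher.get_matching_blocks():
        # Align the window so the matched block sits at the same offset.
        start = max(b_start - a_start, 0)
        window = longer[start:start + len(shorter)]
        best = max(best, SequenceMatcher(None, shorter, window).ratio())
    return int(round(100 * best))


def exhaustive_window_score(s1, s2):
    """Score every full-length window; slow, but independent of block choice."""
    shorter, longer = _split(s1, s2)
    best = 0.0
    for start in range(len(longer) - len(shorter) + 1):
        window = longer[start:start + len(shorter)]
        best = max(best, SequenceMatcher(None, shorter, window).ratio())
    return int(round(100 * best))
```

With the exhaustive scan, every full window of p is also a window of t, so the score against t cannot be lower than the score against p as long as s is no longer than p. Comparing `candidate_window_score(s, t)` with `autojunk=False` and with `exhaustive_window_score(s, t)` on the strings above may show whether the candidate selection is responsible for the gap. Note that the exhaustive scan can be slow on a transcript of this size.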