whiletrue-industries / odds

MIT License

usecase focus: extract Ontario article(s) claims to validate #35

Closed noamoss closed 1 month ago

noamoss commented 1 year ago

limited to the following topics:

  1. Find relevant stories on these topics.
  2. For each story, export the claims whose credibility can be verified against official local datasets.
  3. Manually verify the results.
noamoss commented 1 year ago


[place name] = [government level] =

I want to validate claims about [place name] based on official datasets at the [government level] level.

Please analyze the following article and export all the claims that can be verified for credibility based on official datasets. Make sure to provide a brief explanation of each claim's credibility and reference any relevant official sources if available, for [Place name] [government level] only. Show results in a validated json format.

Article Title: [Title] Author: [Author name] Publication Date: [Date]

Article Text: [Article Text]


  1. Review the article carefully and identify any claims made within the text.
  2. For each claim, determine whether it can be verified based on official datasets or government sources of [place name] [level] only.
  3. Export the verifiable claims in a list format along with a brief explanation of each claim's credibility and the official dataset/source it can be validated against, if applicable.

Example Claim:

{"quote": "When the unemployment went low by two percent during the previous quarter...",
 "claim": "The unemployment rate decreased by 2% in the last quarter.",
"credibility explanation": "this claim is verifiable through official labor market reports published by the [Name of the Country]'s Department of Labor Statistics for the corresponding period."}

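A minimal sketch of how the prompt above could be filled in and how the model's reply could be checked, assuming the reply is a JSON array (or a single object) with the three keys shown in the example claim. The function names `build_prompt` and `parse_claims` are hypothetical and not part of this repository, and the template is abbreviated.

```python
import json

# Abbreviated version of the prompt above; the placeholders mirror its bracketed fields.
PROMPT_TEMPLATE = (
    "I want to validate claims about {place} based on official datasets "
    "at the {level} level.\n\n"
    "Article Title: {title} Author: {author} Publication Date: {date}\n\n"
    "Article Text: {text}"
)

# Keys taken from the example claim above.
REQUIRED_KEYS = {"quote", "claim", "credibility explanation"}


def build_prompt(place, level, title, author, date, text):
    """Fill the template; braces inside the article text are left untouched."""
    return PROMPT_TEMPLATE.format(
        place=place, level=level, title=title, author=author, date=date, text=text
    )


def parse_claims(raw):
    """Parse the model reply and keep only records with all three non-empty keys."""
    data = json.loads(raw)  # raises json.JSONDecodeError if the reply is not valid JSON
    if isinstance(data, dict):
        data = [data]
    valid = []
    for item in data:
        if not isinstance(item, dict) or not REQUIRED_KEYS <= item.keys():
            continue
        if all(isinstance(item[k], str) and item[k].strip() for k in REQUIRED_KEYS):
            valid.append(item)
    return valid
```

A reply that fails `json.loads` would need to be retried or repaired by hand, since the prompt only asks the model for "validated json" and cannot guarantee it.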

noamoss commented 1 year ago



    "quote": "Approximately 55,000 education workers have walked off the job in Canada’s most populous province, after the Ontario government passed legislation this week imposing contracts on them and banning strikes.",
    "claim": "Approximately 55,000 education workers are on strike in Ontario.",
    "credibility explanation": "The claim is verifiable through official records and reports from the Ontario government or relevant labor authorities that track the number of education workers currently on strike in the province."
    "quote": "Ontario Premier Doug Ford’s right-wing government passed Bill 28, the Keeping Students in Class Act, on Thursday afternoon while invoking a contentious clause of Canada’s constitution to preempt court challenges.",
    "claim": "The Ontario government passed Bill 28, the Keeping Students in Class Act.",
    "credibility explanation": "The claim is verifiable through official legislative records and publications from the Ontario government or related government bodies, which document the passing of Bill 28."
    "quote": "The so-called “notwithstanding clause” allows provinces to suspend certain portions of the constitution – the Canadian Charter of Rights and Freedoms – for a five-year period.",
    "claim": "The notwithstanding clause allows provinces to suspend certain portions of the Canadian Charter of Rights and Freedoms for a five-year period.",
    "credibility explanation": "The claim can be verified through official constitutional documents and records from the Government of Canada, which outline the provisions and scope of the notwithstanding clause in the Canadian Constitution."
    "quote": "The Canadian Union of Public Employees (CUPE), which counts 55,000 custodians, maintenance and library workers, secretaries and other education support staff who are affected by Bill 28, called the legislation an attack on all workers’ bargaining rights and staged a strike, anyway.",
    "claim": "The Canadian Union of Public Employees (CUPE) represents 55,000 education support staff affected by Bill 28.",
    "credibility explanation": "The claim can be verified through official records and membership data from the Canadian Union of Public Employees (CUPE) or related labor authorities, which should provide information on the number of education support staff they represent."
    "quote": "The 55,000 members of CUPE’s Ontario School Boards Council of Unions (OSBCU) … who are working in publicly-funded schools across Ontario are the backbone of Ontario’s public education system.",
    "claim": "CUPE's Ontario School Boards Council of Unions (OSBCU) has 55,000 members working in publicly-funded schools across Ontario.",
    "credibility explanation": "The claim can be verified through official membership records and data from CUPE's Ontario School Boards Council of Unions (OSBCU) or relevant educational authorities in Ontario."
    "quote": "They are also the lowest-paid education workers, earning, on average, only [$28,900] $39,000 [Canadian] a year which has left many on the brink of poverty.",
    "claim": "Education workers in CUPE are among the lowest-paid, with average annual earnings of $28,900 to $39,000.",
    "credibility explanation": "The claim can be verified through official salary data and reports from the Ontario government, labor departments, or CUPE, which should provide information on the average earnings of education workers represented by the union."
    "quote": "The four-year contract imposed on workers includes raises of 1.5 to 2.5 percent – far lower than the union demanded in order to meet surging costs of living.",
    "claim": "The imposed four-year contract includes raises of 1.5 to 2.5 percent for workers.",
    "credibility explanation": "The claim can be verified through official contract documents and statements from the Ontario government, Ministry of Education, or labor relations boards, which should provide details of the raises specified in the contract imposed on the education workers."
    "quote": "Bill 28 also includes a daily $2,968 (4,000 Canadian dollars) fine for striking workers, which the union has said it will fight or pay, if needed.",
    "claim": "Bill 28 includes a daily fine of $2,968 (CAD 4,000) for striking workers.",
    "credibility explanation": "The claim can be verified through official legislative records, bill texts, or statements from the Ontario government or relevant legal authorities, which should detail the fines and penalties specified in Bill 28."
    "quote": "This is only the second time the notwithstanding clause has been used in Ontario’s history, and both times Ford was the one who wielded it.",
    "claim": "The notwithstanding clause has been used twice in Ontario's history, and both times were during Doug Ford's tenure.",
    "credibility explanation": "The claim can be verified through official historical records and documents from the Government of Ontario or related government sources, which should provide information on the instances when the notwithstanding clause was invoked in the province and who was responsible for wielding it."
    "quote": "Noa Mendelsohn Aviv, executive director of the Canadian Civil Liberties Association, described the passage of Bill 28 in Ontario this week as “horrifying”.",
    "claim": "Noa Mendelsohn Aviv, executive director of the Canadian Civil Liberties Association, expressed concern over the passage of Bill 28.",
    "credibility explanation": "The claim can be verified through official statements, press releases, or public communications from the Canadian Civil Liberties Association or related organizations, which should provide evidence of Noa Mendelsohn Aviv's reaction to the passage of Bill 28."
noamoss commented 1 year ago

Source: https://farmersforum.com/canada-sets-corn-yield-record-ontario-sets-winter-wheat-record/


    "quote": "Canada’s 2022 crop season produced a record corn harvest and third-largest wheat harvest.",
    "claim": "Canada had a record corn harvest and the third-largest wheat harvest in 2022.",
    "credibility explanation": "This claim can be verified through official data from Statistics Canada's crop production reports for the year 2022."
    "quote": "Ontario saw a winter wheat yield record for both soft red and soft white which cracked 100 bu/ac for the first time.",
    "claim": "Ontario achieved a winter wheat yield record of over 100 bushels per acre for both soft red and soft white varieties in 2022.",
    "credibility explanation": "This claim can be verified through official data from Statistics Canada's crop production reports specifically for Ontario's winter wheat yields in 2022."
    "quote": "Ontario corn and soybean yields were healthy but dipped below Ontario’s 2021 record yields.",
    "claim": "Ontario's corn and soybean yields in 2022 were healthy but lower than the record yields achieved in 2021.",
    "credibility explanation": "This claim can be verified through official data from Statistics Canada's crop production reports specifically for Ontario's corn and soybean yields in both 2021 and 2022."
    "quote": "Canadian grain corn production rose 4 % in 2022 to a record 14.5 million tonnes grown on 3.6 million acres, according to StatCan. Corn yields edged up 0.2 bushels per acre to 160.4 bu/ac.",
    "claim": "Canadian grain corn production increased by 4% in 2022, reaching a record 14.5 million tonnes, with an average yield of 160.4 bushels per acre.",
    "credibility explanation": "This claim can be verified through official data from Statistics Canada's crop production reports for the year 2022."
    "quote": "Ontario’s average corn yields did better than that, at 166 bu/ac, though down 5.3 % from 2021 because of drier conditions in the southwest.",
    "claim": "Ontario's average corn yield in 2022 was 166 bushels per acre, a decrease of 5.3% from the previous year due to drier conditions in the southwest region.",
    "credibility explanation": "This claim can be verified through official data from Statistics Canada's crop production reports for Ontario's corn yields in both 2021 and 2022."
    "quote": "Ontario soybean acres also went up 5.2 % in 2022 but production still fell 2.1 % to 4-million tonnes. The increased acreage couldn’t fully offset a 7 % decline in soybean yields to 48 bu/ac.",
    "claim": "Ontario's soybean acres increased by 5.2% in 2022, but production decreased by 2.1% to 4 million tonnes. The increased acreage couldn't fully offset a 7% decline in soybean yields to 48 bushels per acre.",
    "credibility explanation": "This claim can be verified through official data from Statistics Canada's crop production reports for Ontario's soybean production and yields in 2022."
    "quote": "Ontario produced less winter wheat in 2022 than in 2021 — with just over 2.2 million tonnes harvested from 848,000 acres in 2022. The 2021 wheat crop weighed in at nearly 2.7 million tonnes on just over 1 million acres. Though a smaller crop than a year earlier, record yield performance was the story of Ontario’s 2022 winter wheat: StatCan recorded a whopping 96.5 bu/ac average for all three winter wheat types. Agricorp also reported record-range yields.",
    "claim": "Ontario's winter wheat production in 2022 was lower than in 2021, with just over 2.2 million tonnes harvested from 848,000 acres. However, the 2022 winter wheat achieved a record yield performance of 96.5 bushels per acre across all three wheat types.",
    "credibility explanation": "This claim can be verified through official data from Statistics Canada's crop production reports for Ontario's winter wheat production and yields in both 2021 and 2022."
noamoss commented 1 year ago

source: https://www.cbc.ca/news/canada/toronto/developer-greenbelt-pickering-1.6658896


    "quote": "Much of the land the developer had purchased was located in what is now known as the Duffins Rouge Agricultural Preserve (DRAP), and was protected by easements and a zoning order. In 2005, former premier Dalton McGuinty's Liberal government also included the preserve lands in what it called the Greenbelt — a vast 810,000-hectare area of farmland, forest and wetland stretching from Niagara Falls to Peterborough — marking it as off limits to development.",
    "claim": "The Duffins Rouge Agricultural Preserve (DRAP) was included in the Greenbelt, making it off-limits to development.",
    "credibility explanation": "This claim is credible and can be verified through official records and documents published by the Ontario government, particularly related to the creation of the Greenbelt and the inclusion of DRAP in it."
    "quote": "De Gasperis told the National Post in March 2005 that the province's move to include the DRAP in the Greenbelt would cost his company an estimated $240 million in lost revenue.",
    "claim": "De Gasperis estimated that including DRAP in the Greenbelt would result in $240 million in lost revenue for his company.",
    "credibility explanation": "This claim is based on a statement made by De Gasperis to the National Post. While it is not possible to directly verify the estimated financial impact, the statement's existence can be confirmed through the National Post's archives."
    "quote": "De Gasperis then launched a campaign to stymy plans for the Greenbelt, working with Pickering to develop the preserve land anyway, and eventually took the province to court. His efforts failed and the agricultural land has remained protected, with De Gasperis unable to build new subdivisions as he originally planned.",
    "claim": "De Gasperis attempted to develop the preserve land despite it being part of the Greenbelt, and his efforts were unsuccessful.",
    "credibility explanation": "This claim is based on the information provided in the article and can be verified through court records and official statements from the Ontario government regarding the protection of Greenbelt lands."
    "quote": "But that could soon change thanks to the Ford government's proposal to free up thousands of hectares of Greenbelt land in 15 areas of the province, including the DRAP. This could pave the way for housing and millions in development profits for landowners, including De Gasperis.",
    "claim": "The Ford government proposed to free up thousands of hectares of Greenbelt land, including the DRAP, for potential housing development.",
    "credibility explanation": "This claim is credible and can be verified through official statements and press releases issued by the Ontario government regarding their proposed changes to the Greenbelt and the inclusion of DRAP in it."
    "quote": "A CBC Toronto analysis of property and corporate records has identified 28 properties covering 718 hectares that are owned by companies that list De Gasperis and his brothers, Carlo and Michael De Gasperis, as directors. The seven different companies paid more than $21.5 million for the lots, 24 of which were purchased before the Greenbelt was created.",
    "claim": "CBC Toronto's analysis identified 28 properties covering 718 hectares owned by companies associated with the De Gasperis family, including properties purchased before the Greenbelt's creation.",
    "credibility explanation": "This claim is based on CBC Toronto's analysis of property and corporate records, which can be accessed and verified for accuracy by referring to their published reports and sources."
    "quote": "An aerial view of a portion of the Duffins Creek watershed near the border of Pickering, Ont., and Ajax, Ont., east of Toronto. (Toronto and Region Conservation Authority)",
    "claim": "The Duffins Creek watershed is located near the border of Pickering and Ajax, east of Toronto.",
    "credibility explanation": "This claim is credible and can be verified by referring to official maps and documents from the Toronto and Region Conservation Authority or other government agencies responsible for managing watersheds and natural resources."
    "quote": "In a statement last week, the conservation authority said the province's proposal involving the DRAP could result in 'large-scale unplanned urbanization' and negatively impact nearby watersheds.",
    "claim": "The conservation authority expressed concerns about the potential negative impacts of the province's proposal on nearby watersheds and the possibility of 'large-scale unplanned urbanization.'",
    "credibility explanation": "This claim is credible and can be verified through the official statement issued by the Toronto and Region Conservation Authority (TRCA) regarding the potential impacts of the proposed changes to DRAP and its surrounding areas."
    "quote": "If the Greenbelt land swap goes forward, the Ford government has said it expects landowners to prepare housing plans quickly, with construction to begin no later than 2025. Otherwise, the land will revert to its previous protected status.",
    "claim": "The Ford government expects landowners to prepare housing plans quickly and begin construction by 2025 if the Greenbelt land swap is implemented.",
    "credibility explanation": "This claim is credible and can be verified through official government statements and documents related to the proposed changes to the Greenbelt and the timeline for development set by the Ford government."
    "quote": "Pickering Mayor Kevin Ashe said the local council has been asking for development on the DRAP for 20 years, arguing its inclusion in the Greenbelt was based on 'political science,' not 'real science.' He thinks the province's tight timeline can be met.",
    "claim": "Pickering Mayor Kevin Ashe has been advocating for development on the DRAP for 20 years, and he believes the province's timeline for development can be met.",
    "credibility explanation": "This claim is based on Mayor Kevin Ashe's statements and public positions regarding development on the DRAP. While the duration of his advocacy cannot be independently verified, his statements and beliefs are a matter of public record."
    "quote": "The De Gasperis brothers and their companies — Tacc Developments, Tacc Construction, Arista Homes and Opus Homes — have donated at least $163,362 to the Progressive Conservative Party of Ontario and its politicians since 2014, the last date for which data is available online.",
    "claim": "The De Gasperis brothers and their companies have donated at least $163,362 to the Progressive Conservative Party of Ontario and its politicians since 2014.",
    "credibility explanation": "This claim is verifiable through official records published by Elections Ontario, which keep track of political donations made to political parties and their politicians."
    "quote": "The data shows they also donated at least $84,413 to the Ontario Liberal Party when it was in government and $27,981 to the Ontario NDP, the data shows.",
    "claim": "The De Gasperis brothers and their companies donated at least $84,413 to the Ontario Liberal Party and $27,981 to the Ontario NDP.",
    "credibility explanation": "This claim is verifiable through official records published by Elections Ontario, which keep track of political donations made to political parties and their politicians."
    "quote": "Other family members appear to have donated tens of thousands more.",
    "claim": "Other family members made additional political donations.",
    "credibility explanation": "This claim is based on the information provided in the article, but without specific names or amounts, it cannot be independently verified. The credibility of this claim relies on further research to identify the individuals and the amounts of their political donations."
noamoss commented 1 year ago

source: https://www.thestar.com/opinion/commentary/2015/05/21/ontarios-greenbelt-encourages-smarter-urban-growth.html


"quote": "In 2005, former premier Dalton McGuinty's Liberal government also included the preserve lands in what it called the Greenbelt — a vast 810,000-hectare area of farmland, forest and wetland stretching from Niagara Falls to Peterborough — marking it as off limits to development.",
"claim": "The preserve lands in Ontario were included in the Greenbelt by Dalton McGuinty's Liberal government in 2005.",
"credibility explanation": "This claim is credible. The inclusion of the preserve lands in the Greenbelt by the McGuinty government can be verified through official government records, such as legislative documents and announcements."
"quote": "The province moved to repeal the DRAP Act after receiving requests from the former and current mayor of Pickering.",
"claim": "The province has proposed repealing the Duffins Rouge Agricultural Preserve (DRAP) Act.",
"credibility explanation": "This claim is credible. The proposal to repeal the DRAP Act can be verified through official government sources, such as legislative records and announcements."
"quote": "If the Greenbelt land swap goes forward, the Ford government has said it expects landowners to prepare housing plans quickly, with construction to begin no later than 2025. Otherwise, the land will revert to its previous protected status.",
"claim": "The Ford government expects landowners to prepare housing plans and begin construction on Greenbelt land by 2025 if the proposed land swap goes forward.",
"credibility explanation": "This claim is credible. The expectation set by the Ford government regarding housing plans and construction timelines can be verified through official government statements, policy documents, or proposed legislation."
"quote": "The De Gasperis brothers and their companies have donated at least $163,362 to the Progressive Conservative Party of Ontario and its politicians since 2014.",
"claim": "The De Gasperis brothers and their companies have made significant political donations to the Progressive Conservative Party of Ontario since 2014.",
"credibility explanation": "This claim is credible. The political donations made by the De Gasperis brothers and their companies can be verified through official records maintained by Elections Ontario."
"quote": "De Gasperis family companies have also employed at least three lobbyists with close ties to the PC government.",
"claim": "De Gasperis family companies have employed lobbyists with close ties to the Progressive Conservative government.",
"credibility explanation": "This claim is credible. The employment of lobbyists by De Gasperis family companies can be verified through records kept by the Office of the Integrity Commissioner."
"quote": "A coalition of 200 groups and individuals has penned a letter calling out the Ontario government for its plans to have tens of thousands of homes located in the province's Greenbelt.",
"claim": "A coalition of 200 groups and individuals has criticized the Ontario government's plans for housing development in the Greenbelt.",
"credibility explanation": "This claim is credible. The existence of a coalition and their criticism of the government's plans can be verified through public statements, news reports, or official communications from the coalition."
noamoss commented 1 year ago

source: https://www.therecord.com/news/council/north-dumfries-headed-to-ontario-land-tribunal-over-alps-road-gravel-pit-application/article_3fc5bf38-1ece-5f19-ba2f-46a7fc0865b1.html?


    "quote": "The Township of North Dumfries is headed to the Ontario Land Tribunal after failing to make a decision on a proposed pit on Alps Road in 2020.",
    "claim": "The Township of North Dumfries is involved in a case with the Ontario Land Tribunal regarding a proposed pit on Alps Road since 2020.",
    "credibility explanation": "This claim can be considered credible as it is mentioned in the article and pertains to a specific legal case involving the Township of North Dumfries and the Ontario Land Tribunal."
    "quote": "The applicant, a numbered company, appealed to the tribunal after council did not decide on its zone change application for 1662 Alps Rd. within 90 days of receiving a complete application, as required by legislation at the time.",
    "claim": "The applicant, a numbered company, appealed to the Ontario Land Tribunal due to the council's failure to make a decision on its zone change application within the required timeframe.",
    "credibility explanation": "This claim can be considered credible as it is mentioned in the article and relates to a legal process involving the applicant and the Ontario Land Tribunal."
    "quote": "“Council has not as of yet taken a position on the zone change application or the parallel (Aggregate Resource Act) licence application,” chief administrative officer Andrew McNeely said.",
    "claim": "The North Dumfries council has not made a decision on the zone change application or the Aggregate Resource Act licence application yet.",
    "credibility explanation": "This claim can be considered credible as it is a direct quote from the chief administrative officer Andrew McNeely, indicating the council's current stance on the applications."
    "quote": "The pit is known as the Chudyk Pit.",
    "claim": "The proposed pit on Alps Road is referred to as the Chudyk Pit.",
    "credibility explanation": "This claim can be considered credible as it is mentioned in the article and provides a name for the proposed pit on Alps Road."
    "quote": "The property is sandwiched between three existing pits — the Lafarge Cambridge Pit to the west, Alps Pit to the east and the Cedar Creek pit to the north.",
    "claim": "The property for the proposed Chudyk Pit is located between three existing pits: Lafarge Cambridge Pit (west), Alps Pit (east), and Cedar Creek Pit (north).",
    "credibility explanation": "This claim can be considered credible as it is mentioned in the article and provides a description of the location of the proposed Chudyk Pit in relation to existing pits."
    "quote": "It’s about 40.4 hectares in size and fronts onto Alps Road, midway between Dumfries Road and Edworthy Side Road. About 31 hectares would be used for extraction.",
    "claim": "The proposed Chudyk Pit is approximately 40.4 hectares in size, with about 31 hectares intended for extraction, and it is located on Alps Road between Dumfries Road and Edworthy Side Road.",
    "credibility explanation": "This claim can be considered credible as it is mentioned in the article and provides specific details about the size and location of the proposed Chudyk Pit."
    "quote": "It’s estimated there are about five million tonnes of aggregate, 1.5 metres above the groundwater table, at the Chudyk site.",
    "claim": "The estimated amount of aggregate at the Chudyk Pit site is approximately five million tonnes, located 1.5 meters above the groundwater table.",
    "credibility explanation": "This claim can be considered credible as it is mentioned in the article and provides an estimate of the amount of aggregate at the Chudyk Pit site."

**Notice the attached output given by the model:**

Note: The article does not contain specific claims that can be validated with official datasets at the county level in Ontario, Canada. The claims are mainly related to the legal case, the status of applications, and specific details about the proposed pit and its location, all of which are provided in the article and can be considered credible based on the information presented.

noamoss commented 1 year ago

source: https://theconversation.com/ontarios-new-child-welfare-policy-is-promising-but-youth-leaving-care-need-more-support-202437


    "quote": "During the COVID-19 pandemic, Ontario’s Ministry of Children, Community and Social Services placed a moratorium on its child welfare policy that requires youth to leave foster care and group homes once they turn 18.",
    "claim": "Ontario's Ministry of Children, Community and Social Services placed a moratorium on its child welfare policy that requires youth to leave foster care and group homes at age 18 during the COVID-19 pandemic.",
    "credibility explanation": "This claim can be verified through official statements or announcements made by the Ontario Ministry of Children, Community and Social Services or related government reports addressing child welfare policies during the COVID-19 pandemic."
    "quote": "The ministry’s commitment to a child welfare redesign is welcome and timely.",
    "claim": "The Ontario Ministry of Children, Community and Social Services has committed to a child welfare redesign.",
    "credibility explanation": "This claim can be verified through official statements or publications by the Ontario Ministry of Children, Community and Social Services or related government sources announcing the commitment to a child welfare redesign."
    "quote": "The new policy will replace two policy directives: Continued Care and Support for Youth (CCSY) and Supporting Consistency of Care for Youth Whose Arrangements are Scheduled to Expire During the COVID-19 Pandemic.",
    "claim": "The new child welfare policy in Ontario will replace two policy directives: Continued Care and Support for Youth (CCSY) and Supporting Consistency of Care for Youth Whose Arrangements are Scheduled to Expire During the COVID-19 Pandemic.",
    "credibility explanation": "This claim can be verified through official documents or statements from the Ontario Ministry of Children, Community and Social Services or related government publications announcing the replacement of the two policy directives with the new child welfare policy."
    "quote": "Youth will now begin transition planning on their 13th birthday. Between the ages of 13 and 18, planning will focus on health, education, identity, family and social relationships, emotional and behavioural development and self-care skills.",
    "claim": "Youth in Ontario will now begin transition planning for leaving care on their 13th birthday, with planning focusing on various aspects of their well-being until they turn 18.",
    "credibility explanation": "This claim can be verified through official documents or statements from the Ontario Ministry of Children, Community and Social Services or related government sources outlining the new child welfare policy and its provision for transition planning starting at age 13."
    "quote": "At age 18, most youth will still be required to leave their care placements. However, they will be eligible to receive transition supports, like those available under CCSY, until they turn 23.",
    "claim": "At age 18, most youth in Ontario will still be required to leave their care placements, but they will be eligible to receive transition supports until they turn 23.",
    "credibility explanation": "This claim can be verified through official documents or statements from the Ontario Ministry of Children, Community and Social Services or related government sources that outline the age at which youth are required to leave care placements and the eligibility for transition supports until age 23 under the new child welfare policy."
    "quote": "The new policy also includes the Ready, Set, Go Program which begins targeted transition planning when youth turn 16.",
    "claim": "The new child welfare policy in Ontario includes the Ready, Set, Go Program, which initiates targeted transition planning when youth turn 16.",
    "credibility explanation": "This claim can be verified through official documents or statements from the Ontario Ministry of Children, Community and Social Services or related government sources outlining the inclusion of the Ready, Set, Go Program as part of the new child welfare policy."
noamoss commented 1 year ago

@akariv @pwalsh @OriHoch Ontario stories and claims are ready for your review.

akariv commented 1 year ago

Okay, I tried to tackle the first item ("The unemployment rate decreased by 2% in the last quarter").

Using our new API, https://ckangpt-api.uumpa.xyz/docs#/default/find_datasets__find_datasets_get, I tried searching for various terms. The best results were for "unemployment" and "unemployment percent". This leads me to a tentative conclusion (it needs more evidence) that the next step, after receiving the claim, would be to convert it to a 'search query' with a prompt such as "describe the sort of information that you need in order to verify this claim, in two words or less".

For example:

Here is a list of claims. For each claim, describe in 5 words or less the sort of information or data that would be needed in order to verify the claim:
- The unemployment rate decreased by 2% in the last quarter
- Approximately 55,000 education workers are on strike in Ontario
- The Ontario government passed Bill 28, the Keeping Students in Class Act
- The notwithstanding clause allows provinces to suspend certain portions of the Canadian Charter of Rights and Freedoms for a five-year period
- The Canadian Union of Public Employees (CUPE) represents 55,000 education support staff affected by Bill 28
- Canada had a record corn harvest and the third-largest wheat harvest in 2022
- Canadian grain corn production increased by 4% in 2022, reaching a record 14.5 million tonnes, with an average yield of 160.4 bushels per acre

- Official unemployment rate data
- Documentation or news reports
- Official government records or news reports
- Canadian Constitution and related documentation
- CUPE membership and affected staff
- Official agricultural production statistics
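
A rough sketch of this claim-to-query step, assuming the model answers with one `- ` bullet per claim in the same order. The helper names are hypothetical. The sample above lists seven claims but only six descriptions, so the sketch checks the counts before pairing them up.

```python
def build_query_prompt(claims, max_words=5):
    """Build the prompt shown above for a list of claim strings."""
    header = (
        f"Here is a list of claims. For each claim, describe in {max_words} words "
        "or less the sort of information or data that would be needed in order "
        "to verify the claim:"
    )
    return "\n".join([header] + [f"- {claim}" for claim in claims])


def parse_queries(response):
    """Extract the bulleted lines from the model's reply, dropping the bullet."""
    queries = []
    for line in response.splitlines():
        line = line.strip()
        if line.startswith("-"):
            queries.append(line[1:].strip())
    return queries


def pair_claims_with_queries(claims, response):
    """Pair each claim with its query; fail loudly if the counts differ."""
    queries = parse_queries(response)
    if len(queries) != len(claims):
        raise ValueError(
            f"expected {len(claims)} queries, got {len(queries)}; "
            "cannot align claims with queries"
        )
    return list(zip(claims, queries))
```

Each resulting query string could then be sent to the dataset search; the request parameters of that endpoint are not shown in this thread, so the call itself is left out here.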
akariv commented 1 year ago

(btw, searching for "Official unemployment rate data" retrieves the correct dataset as the most relevant one which is super cool :)

[Screenshot, 2023-07-20 at 21:24:15]
noamoss commented 1 year ago

@akariv really nice. I thought the idea behind the embedding vectorization was to skip the need to define or guess a query. What am I missing?

akariv commented 1 year ago

The way I see it is that we're describing all the datasets by asking GPT to tell us "what's in the dataset". Then, when we get a claim, we need to convert it to something that looks similar to these dataset descriptions, so embedding similarity works best.

We can do it by asking "what needs to be in the dataset that would help you verify it" - which conceptually is a little different from the raw claim - and results in this "dataset query" which plays nice with the dataset descriptions.

Once we get the list of dataset candidates, we can go back to the query and ask GPT which dataset is the most relevant for the original claim.

@OriHoch I think this should probably be done as a preprocessing step, before querying pinecone - wdyt?
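
To illustrate why a "dataset query" can match dataset descriptions better than the raw claim, here is a toy sketch of the ranking step. It uses word-count vectors and cosine similarity as a stand-in for real embeddings and a vector database, and the dataset names and descriptions are made up for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_datasets(dataset_query, descriptions, top_k=3):
    """Return the names of the top_k datasets whose descriptions best match the query."""
    q = embed(dataset_query)
    scored = [(cosine(q, embed(desc)), name) for name, desc in descriptions.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, name breaks ties
    return [name for _, name in scored[:top_k]]


# Hypothetical dataset descriptions, standing in for the GPT-written ones.
descriptions = {
    "labour-force": "Monthly unemployment rate data by province",
    "crop-production": "Annual crop production and yield statistics",
    "bills": "Legislative bills passed by the assembly",
}

candidates = rank_datasets("Official unemployment rate data", descriptions)
```

The candidate list would then go back to GPT together with the original claim to pick the most relevant dataset, as described above.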

OriHoch commented 1 year ago

we have such a step - https://github.com/whiletrue-industries/ckangpt/blob/main/ckangpt/frontend/get_vector_db_query.py

akariv commented 12 months ago

Adding from my conversation with @noamoss: