uthambathoju / CoronaryHeartDisease

Classifying whether a person gets coronary heart disease in the next 10 years

adas #1

Open uthambathoju opened 3 years ago

uthambathoju commented 3 years ago

data_tofit = [1.6800483 -1.641695388;
              0.501309281 -0.977697538;
              1.528012113 0.52771122;
              1.70012253 1.711524991;
              1.992493625 1.891000015;
              2.706075824 -0.463427794;
              2.994931927 -0.443566619;
              3.491852811 -1.275179133;
              3.501191722 -0.690499597;
              4.459924502 -5.516130799;
              4.936965851 -6.001703074;
              5.023289852 -8.36416901;
              5.04233698 -7.924477517;
              5.50739285 -10.77482371;
              5.568665171 -10.9171878]

uthambathoju commented 3 years ago

Built a machine learning solution that analyzes multiple data sources to understand what drives false positives in NLP outcomes and to reduce them proactively. Designed a predictive, stable model based on NLP Member/Winner data. Leveraged historical data to model the impact of diagnosis conditions on false positives. Applied best practices in model selection and testing to build an effective false-positive detection model for analyzing medical charts. Modeling: used several classification and regression models.
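As a point of reference for what "false positive" means when evaluating such a model, here is a minimal sketch that counts confusion-matrix cells and derives the false positive rate and precision. The labels are made up for illustration (1 = condition flagged, 0 = not flagged), and the function name `false_positive_metrics` is hypothetical; this is not the model or the data described above.

```python
def false_positive_metrics(y_true, y_pred):
    """Count confusion-matrix cells and derive false-positive-oriented metrics."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    # False positive rate: share of true negatives that were wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # Precision: share of flagged cases that were actually positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "false_positive_rate": fpr, "precision": precision}


# Hypothetical labels, for illustration only.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]
metrics = false_positive_metrics(y_true, y_pred)
print(metrics["false_positive_rate"], metrics["precision"])  # 0.4 0.5
```

Tracking both numbers matters when comparing candidate models, because a model can lower its false positive rate simply by flagging fewer cases overall.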

uthambathoju commented 3 years ago

import requests
from bs4 import BeautifulSoup

base_url = 'https://www.mwnation.com/'
page_url = 'https://www.mwnation.com/section/chichewa/page/'

# Listing pages: the section front page plus pages 2 through 76.
urls = ['https://www.mwnation.com/section/chichewa/']
for i in range(2, 77):
    urls.append(page_url + str(i) + '/')

# Links whose URL contains any of these keywords are skipped.
ignore_list = [
    'section', 'about-npl', 'imagination', 'adverts', 'rate-card', 'contact-us', 'wp-content',
    'court-snubs-mcp-in-commissioners-case', 'values', 'our-philosophy', 'editorial-policy',
    'advertising-policy', 'code-of-conduct', 'plagiarism-disclaimer', 'disclaimer',
    'privacy-policy', 'terms-of-use', 'lessons-from-derek-chauvin-case',
    'careening-dangerously-like-kabaza-motorcyclist', 'progress-needs-positive-reporting',
    'fighting-misinformation-with-information', 'wp-contentuploads202103WFP-Afikepo-1-2.pdf',
    'minister-sues-2-over-social-media-post', 'malawi-hunts-for-forex-in-ghana',
    'chakweras-mining-dream', 'mcp-to-elect-vice-president-soon',
]

# Collect article links from every listing page.
urls_data = []
for url in urls:
    reqs = requests.get(url)
    soup = BeautifulSoup(reqs.text, 'html.parser')
    for link in soup.find_all('a'):
        value = link.get('href')
        # Skip anchors without an href and links that point to other sites.
        if value is None or base_url not in value:
            continue
        if not any(keyword in value for keyword in ignore_list):
            urls_data.append(value)

# De-duplicate and drop the site root, which is not an article.
urls_data = list(set(urls_data) - {base_url})
print(len(urls_data))

# Hard-coded list that overrides the crawled links, presumably for a quick test.
urls_data = ['https://www.mwnation.com/zokolola-zichuluka/',
             'https://www.mwnation.com/boma-ligulitsa-chimanga-pamtengo-wa-sabuside/']

# Scrape the paragraphs of each article as (title, paragraph text) pairs.
idx = 0
data_dict = {}
for url in urls_data:
    r1 = requests.get(url)
    # BeautifulSoup needs the response body (r1.text), not the Response object.
    soup = BeautifulSoup(r1.text, 'html.parser')
    title = soup.title.text if soup.title else ''
    for link in soup.find_all('p'):
        # Skip the publisher footer paragraph.
        if 'Nation Publications Limited' not in link.text:
            text_data = link.text.strip()
            # list.append() returns None, so store the tuple directly.
            data_dict[idx] = (title, text_data)
            idx += 1
print(idx)
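The link-filtering step of the scraper can be checked without any network access by pulling it into a pure function. The following is a minimal sketch under that assumption; the name `filter_article_links` and the sample hrefs are hypothetical and only illustrate the filtering rules (same site, no ignored keyword, not the site root, no duplicates).

```python
def filter_article_links(hrefs, base_url, ignore_list):
    """Return the unique hrefs under base_url that contain no ignored keyword."""
    kept = set()
    for href in hrefs:
        # Anchors without an href attribute yield None from Tag.get('href').
        if href is None or base_url not in href:
            continue
        # The site root itself is not an article.
        if href == base_url:
            continue
        if any(keyword in href for keyword in ignore_list):
            continue
        kept.add(href)
    return sorted(kept)


# Hypothetical sample of hrefs as they might come off a listing page.
sample_hrefs = [
    None,
    'https://www.mwnation.com/',
    'https://www.mwnation.com/zokolola-zichuluka/',
    'https://www.mwnation.com/zokolola-zichuluka/',
    'https://www.mwnation.com/section/chichewa/page/2/',
    'https://www.mwnation.com/privacy-policy/',
    'https://example.com/elsewhere/',
]
articles = filter_article_links(
    sample_hrefs, 'https://www.mwnation.com/', ['section', 'privacy-policy'])
print(articles)  # ['https://www.mwnation.com/zokolola-zichuluka/']
```

Keeping the filter separate from the HTTP requests makes it easy to see why a given link was dropped before running the full crawl over all listing pages.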
