Open Shell1500 opened 4 years ago
So I kind of found a way to do it myself.
from bs4 import BeautifulSoup

def reload_soup(driver):
    source = driver.page_source
    soup = BeautifulSoup(source, 'html.parser')
    # Narrow the page down to the open chat, then to its message area.
    all_soup = soup.find('div', {"id" : "main"})
    soup = BeautifulSoup(str(all_soup), 'html.parser')
    filtered_soup = soup.find('div', {"class" : "copyable-area"})
    filtered_soup = list(filtered_soup)[2]
    soup = BeautifulSoup(str(filtered_soup), 'html.parser')
    final_soup = soup.find_all('div', {"class" : "copyable-text"})
    # 'data-pre-plain-text' holds the "[date] name:" prefix of each message.
    info = []
    for i in final_soup:
        if i.has_attr('data-pre-plain-text'):
            info.append(i['data-pre-plain-text'])
    dates = []
    names = []
    for i in info:
        i = i.strip()
        date = i.split(']')[0][1:].strip()
        name = i.split(']')[1].strip()
        dates.append(date)
        names.append(name.replace(':', ''))
    final_soup = [text_div.text.replace('\n', ' ') for text_div in final_soup]
    #print(len(dates), len(names), len(final_soup))
    ss = []
    for key, i in enumerate(final_soup):
        # Pad names/dates with their last value when some divs lack the attribute.
        if len(names) < len(final_soup):
            names.append(names[(len(names)-1)])
            dates.append(dates[(len(dates)-1)])
            print('add')
            print(len(names), len(final_soup))
        ss.append(names[key] + ',' + i + ',' + dates[key] + '\n')
    return ss
This is the modified reload_soup function. It uses the same extracted HTML to find the name and date of each message. A slight modification is needed in the print_to_console function, because this modified function returns a list, but that shouldn't be a major issue.
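To make the name/date extraction easier to follow, here is a minimal standalone sketch of just that step. It assumes the data-pre-plain-text value looks like "[timestamp] Name: ", which is what the splitting logic above implies. The helper name parse_pre_plain_text and the sample string are made up for illustration and are not part of the original script.

```python
def parse_pre_plain_text(raw):
    """Split a '[timestamp] Name: ' string into (timestamp, name)."""
    raw = raw.strip()
    # Everything between the leading '[' and the first ']' is the timestamp.
    timestamp, _, rest = raw.partition(']')
    timestamp = timestamp.lstrip('[').strip()
    # The sender name follows; drop only the trailing colon.
    name = rest.strip().rstrip(':').strip()
    return timestamp, name


# Hypothetical sample value, for illustration only.
sample = "[10:30, 1/2/2020] Alice: "
print(parse_pre_plain_text(sample))
```

Unlike name.replace(':', '') in the function above, this only strips the trailing colon, so a colon inside a contact name would be kept.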
There is no way to distinguish between the sender and the receiver.