Sales-Choice-Volunteering-Project / EmotionAnalyzerWeka

The program for obtaining emotion data

Figure out how to ensure Arff file gives text quotations #26

Closed sherlockliang888 closed 3 years ago

sherlockliang888 commented 3 years ago

ARFF requires quotation marks around text strings, but in some instances the quotes are not added automatically. Figure out how to add them with Python or another tool.

For example:

-positive,"According to the company 's updated strategy for the years 2009-2012 , Basware targets a long-term net sales growth in the range of 20 % -40 % with an operating profit margin of 10 % -20 % of net sales ."
-positive,FINANCING OF ASPOCOMP 'S GROWTH Aspocomp is aggressively pursuing its growth strategy by increasingly focusing on technologically more demanding HDI printed circuit boards PCBs .
-positive,"For the last quarter of 2010 , Componenta 's net sales doubled to EUR131m from EUR76m for the same period a year earlier , while it moved to a zero pre-tax profit from a pre-tax loss of EUR7m ."

The second instance has no quotes.
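One likely explanation, assuming the CSV was produced by a writer using the common "minimal" quoting rule: a field is quoted only when it contains the delimiter, and the second instance is the only one whose text has no comma. The sketch below reproduces that behaviour with Python's standard csv module. The shortened sample rows and the helper name `dump` are illustrative, not taken from the project.

```python
import csv
import io

# Shortened sample rows (illustrative): the first text contains a comma,
# the second does not.
rows = [
    ["positive", "For the last quarter of 2010 , net sales doubled ."],
    ["positive", "FINANCING OF ASPOCOMP 'S GROWTH"],
]

def dump(rows, quoting):
    """Write rows to an in-memory CSV and return the resulting lines."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting, lineterminator="\n").writerows(rows)
    return buf.getvalue().splitlines()

# QUOTE_MINIMAL (the default) quotes a field only when it has to.
minimal = dump(rows, csv.QUOTE_MINIMAL)
# QUOTE_ALL quotes every field, including the class label.
all_quoted = dump(rows, csv.QUOTE_ALL)

for line in minimal + all_quoted:
    print(line)
```

With minimal quoting the second row comes out unquoted, which matches the instance above. `csv.QUOTE_ALL` quotes the label as well, so it may or may not be what the ARFF file needs.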

sherlockliang888 commented 3 years ago

Read the CSV into a DataFrame, append a comma to each label, wrap each text in quotation marks, then save the result as a .txt file instead of a .csv file. The following Python script worked when tested.

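"""
import numpy as np
import pandas as pd

# Load the headlines and name the two columns.
df = pd.read_csv("financial_news_headlines_sentiment.csv", encoding="ISO-8859-1", names=["class", "text"])
# Wrap every text value in double quotes and append a comma to every label.
df["text"] = df["text"].apply(lambda x: '"' + str(x) + '"')
df["class"] = df["class"].apply(lambda x: str(x) + ",")
# Write one row per line as plain text (columns are joined by a space).
np.savetxt(r"finance.txt", df.values, fmt="%s", encoding="utf-8")
"""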
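One case the script above does not cover is a text that itself contains a double quote, which would end the quoted string early. A minimal sketch of one way to handle it, assuming the ARFF reader accepts backslash-escaped quotes inside a quoted string (worth confirming against the Weka version in use). The function names `quote_arff_string` and `arff_data_line` are hypothetical helpers, not part of the project.

```python
def quote_arff_string(text: str) -> str:
    """Wrap text in double quotes, escaping backslashes and inner double quotes.

    Assumption: the ARFF parser understands backslash escapes.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def arff_data_line(label: str, text: str) -> str:
    """Build one data row: an unquoted class label, a comma, a quoted text."""
    return f"{label},{quote_arff_string(text)}"

line = arff_data_line("positive", "FINANCING OF ASPOCOMP 'S GROWTH")
print(line)
```

Building each row this way quotes every text value whether or not it contains a comma, and leaves the class label unquoted.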

Done