dichen001 / Paper-Reading


Cooperation on the Literature Review #1

Closed: dichen001 closed this issue 8 years ago

dichen001 commented 8 years ago

Hi @imaginationsuper,

I just summarized all the papers I have read in a table and a document.

Two things for the moment:

  1. Let's finish reading the papers I mentioned in the document. You can pick the ones I haven't read yet and add them to our Table and Doc.
  2. Add related papers to the Doc, and rename the files to a consistent format: [year]-[citation]-[Name] (see the sketch after this list).
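
A minimal sketch of the renaming step, in case we script it in Python; the function name and the example values below are hypothetical placeholders, not metadata we have extracted:

```python
def paper_filename(year, citations, title):
    """Build a [year]-[citation]-[Name] filename for the Doc."""
    name = "-".join(title.split())  # spaces -> hyphens, matching our existing names
    return f"{year}-{citations}-{name}"

# Hypothetical example values, to be replaced with each paper's real metadata:
print(paper_filename(2004, 85, "Learning to Extract Signature and Reply Lines from Email"))
# -> 2004-85-Learning-to-Extract-Signature-and-Reply-Lines-from-Email
```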

BTW, we can find related papers cited in these papers, especially where the authors compare their work with others'.

Right now I have read the first 5 and am about to start the 6th.

jerry-shijieli commented 8 years ago

Great! Then I will read the 7th paper. I also suggest we make the 4th paper (2004----85-----Learning-to-Extract-Signature-and-Reply-Lines-from-Email) one of the three seed papers, since it is cited a lot. We can then trace and read the 16 references in that paper.

jerry-shijieli commented 8 years ago

Summary Table for Literature Review

| Year | Cite | Name | Data | Method | Results |
| --- | --- | --- | --- | --- | --- |
| 2008 | 295 | An analysis of active learning strategies for sequence labeling tasks | CoNLL-03 (Sang and De Meulder, 2003), a collection of newswire articles; NLPBA (Kim et al., 2004), a large collection of biomedical abstracts annotated with five entities of interest; BioCreative (Yeh et al., 2005); FlySlip (Vlachos, 2007); CORA. | Sequence labeling with CRFs; active learning with sequence models. | The large-scale empirical evaluation demonstrates that some of the newly proposed methods advance the state of the art in active learning with sequence models, including information density (recommended in practice), sequence vote entropy, and sometimes Fisher information. |
| 2006 | 205 | Visualizing email content: portraying relationships from conversational histories | Participants' email archives ranged from 90 MB to more than 1 GB (average 456 MB), spanning from less than one year to over nine years of email activity. | Monthly and yearly word summaries; adjustable time scale; topic-word scoring with TF-IDF (sketched after the table). | Two modes of personalized email visualization: exploration of "big picture" trends and themes ("haystack") and more detail-oriented exploration ("needle"). |
| 2012 | 1 | Interpreting Contact Details out of E-mail Signature Blocks | The service is available for Gmail users and Google Apps IMAP servers. Only French and English are fully covered, but any ISO-Latin signature can be analyzed. | Context; extraction of the HTML part from the MIME format; elimination of specific configurations; language detection; formatting details in vCard; standardizing phone numbers; updating the address books. | Millions of emails were analyzed by the servers; specific rules were adopted for robustness, e.g. emails with non-ISO-Latin encodings or larger than 200 kB are not analyzed. |
| 2007 | 68 | Author profiling for English emails | Emails in several varieties of English, from native and non-native speakers coming from different geographical areas. | Analysis: document parsing, text processing, and linguistic analysis. Classification using the WEKA toolkit with several algorithms: decision trees (J48 (Quinlan, 1993), RandomForest (Breiman, 2001)), lazy learners (IBk (Aha et al., 1991)), rule-based learners (JRip (Cohen, 1995)), Support Vector Machines (SMO (Keerthi et al., 2001), LibSVM (Chang and Lin, 2001)), and ensemble/meta-learners (Bagging (Breiman, 1996), AdaBoostM1 (Freund and Schapire, 1996)). | The chosen approach works well for author profiling; using different classifiers in combination with a subset of the available features can be beneficial for predicting single traits. |
| 2005 | 144 | Extracting personal names from email: applying named entity recognition to informal text | CSpace email corpus (Kraut et al., 2004); Enron corpus (Klimt and Yang, 2004). | NER (named entity recognition) with CRFs (conditional random fields). | Entity-level F1 (improvement / score): Mgmt-Teams +3.9% / 91.3; Mgmt-Game +3.8% / 95.4; Enron-Meetings +1.2% / 77.9; Enron-Random +0.7% / 76.7. |
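
The topic-word scoring in the 2006 visualization paper is TF-IDF. Below is a minimal sketch of the standard weighting; the paper's exact variant and tokenization are not given in the table, so the formula choice and the example data here are only illustrative assumptions:

```python
import math
from collections import Counter

def tfidf(docs):
    """Score each term in each document by TF-IDF.

    docs: list of token lists, one per document (e.g. one per month of email).
    Returns: one {term: score} dict per document.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            # term frequency * inverse document frequency
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores

# Toy example: the word distinctive to each "month" gets the highest score.
months = [
    "budget meeting budget review".split(),
    "paper deadline paper review".split(),
]
for ranked in tfidf(months):
    print(max(ranked, key=ranked.get))  # -> budget, paper
```

Terms that appear in every period (like "review" above) get a zero IDF, which is why TF-IDF surfaces period-specific topic words rather than common vocabulary.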