christiebarron / aiTextDetector

A project aimed at identifying when a written text was generated by an AI.
5 stars 3 forks

Feature engineering: lexical, syntactic, stylistic, and semantic features #5

Open birdipa2 opened 1 year ago

birdipa2 commented 1 year ago

Lexical Features:

- Average word length
- Vocabulary richness (e.g., type-token ratio)
- Frequency distribution of words, bigrams, and trigrams
- Frequency of stop words
- Frequency of rare words or unique terms
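As a starting point, the lexical features above could be computed with the standard library alone. This is only a sketch: `tokenize`, `ngrams`, `lexical_features`, the `rare_max_count` threshold, and the tiny `STOP_WORDS` set are illustrative assumptions, not code from this repo.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "that"}


def tokenize(text):
    """Lowercase the text and return runs of letters/apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))


def lexical_features(text, rare_max_count=1):
    """Compute simple lexical features; returns {} for text with no tokens."""
    tokens = tokenize(text)
    if not tokens:
        return {}
    counts = Counter(tokens)
    total = len(tokens)
    return {
        "avg_word_length": sum(len(t) for t in tokens) / total,
        # Type-token ratio: distinct words divided by total words.
        "type_token_ratio": len(counts) / total,
        "stop_word_ratio": sum(1 for t in tokens if t in STOP_WORDS) / total,
        # Share of distinct words that occur at most rare_max_count times.
        "rare_word_ratio": sum(1 for c in counts.values() if c <= rare_max_count) / len(counts),
        "top_words": counts.most_common(3),
        "top_bigrams": Counter(ngrams(tokens, 2)).most_common(3),
        "top_trigrams": Counter(ngrams(tokens, 3)).most_common(3),
    }
```

Note that type-token ratio is sensitive to text length, so it is probably worth comparing texts of similar size or truncating to a fixed number of tokens first.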

Syntactic Features:

- Average sentence length
- Sentence complexity (e.g., number of clauses per sentence)
- Frequency of different part-of-speech (POS) tags
- Dependency parsing patterns

Stylistic Features:

- Frequency of punctuation marks
- Use of passive voice
- Readability scores
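A rough sketch of the stylistic features, again standard library only. The function names are made up for illustration. The passive-voice check is a crude regex heuristic (a form of "to be" followed by a word ending in -ed/-en) that will miss irregular participles and produce false positives; a POS tagger or dependency parser would be more reliable. The readability score uses the Flesch Reading Ease formula with an approximate syllable counter.

```python
import re
import string


def count_syllables(word):
    """Rough syllable estimate: number of vowel groups, at least one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def stylistic_features(text):
    """Compute simple stylistic features; returns {} if no words or sentences are found."""
    words = re.findall(r"[A-Za-z']+", text)
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return {}
    punctuation = sum(1 for ch in text if ch in string.punctuation)
    # Heuristic only: "to be" form + word ending in -ed/-en.
    passive = len(re.findall(
        r"\b(?:is|are|was|were|be|been|being)\s+\w+(?:ed|en)\b",
        text,
        flags=re.IGNORECASE,
    ))
    syllables = sum(count_syllables(w) for w in words)
    # Flesch Reading Ease: higher scores indicate easier text.
    flesch = (206.835
              - 1.015 * (len(words) / len(sentences))
              - 84.6 * (syllables / len(words)))
    return {
        "punctuation_per_word": punctuation / len(words),
        "passive_per_sentence": passive / len(sentences),
        "flesch_reading_ease": flesch,
    }
```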

Semantic Features:

- Sentiment analysis scores
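To show the shape a sentiment feature could take, here is a minimal lexicon-based polarity score. The `POSITIVE` and `NEGATIVE` word sets are a hypothetical toy lexicon made up for this example; in practice an established tool such as NLTK's VADER analyzer would likely be a better choice.

```python
import re

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}


def sentiment_score(text):
    """Return a polarity in [-1, 1]: (pos - neg) / (pos + neg), or 0.0 if no lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos + neg == 0:
        return 0.0
    return (pos - neg) / (pos + neg)
```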