yoonkim / CNN_sentence

CNNs for sentence classification

Question about clean_str in process_data.py #42

Open zhililab opened 6 years ago

zhililab commented 6 years ago

Hi @yoonkim, I have recently been reading dennybritz/cnn-text-classification-tf, an implementation based on your original code. I found some regex patterns in the function clean_str(string), such as:


    string = re.sub(r"\'d", " \'d", string)
    string = re.sub(r"\'ll", " \'ll", string)
    string = re.sub(r",", " , ", string)
    string = re.sub(r"!", " ! ", string)
    string = re.sub(r"\(", " \( ", string)

From my point of view, when the program finds one of these patterns, it just replaces the match with the same thing. So my question is: what is the purpose of these re.sub calls? I'm confused. Could you give me some clues? Thanks a lot ; )

rzsgrt commented 4 years ago

The replacement string starts with a space, so the first line, string = re.sub(r"\'d", " \'d", string), separates the word i'd into i 'd.
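
To make the effect of the added spaces concrete, here is a minimal sketch that applies four of the substitutions quoted in the question and then splits on whitespace. The helper name `pad_tokens` and the sample sentence are made up for illustration and are not part of process_data.py. The splitting step is an assumption about how the cleaned string is used afterwards.

```python
import re

def pad_tokens(string):
    # Hypothetical helper: four of the substitutions quoted in the question.
    # Each replacement adds spaces around the match, so the match becomes
    # its own whitespace-delimited token.
    string = re.sub(r"\'d", " \'d", string)    # i'd    -> i 'd
    string = re.sub(r"\'ll", " \'ll", string)  # she'll -> she 'll
    string = re.sub(r",", " , ", string)       # go,    -> go ,
    string = re.sub(r"!", " ! ", string)       # stay!  -> stay !
    return string

sentence = "i'd go, she'll stay!"

# Without padding, punctuation and clitics stay glued to the words.
print(sentence.split())
# ["i'd", 'go,', "she'll", 'stay!']

# With padding, each one is a separate token.
print(pad_tokens(sentence).split())
# ['i', "'d", 'go', ',', 'she', "'ll", 'stay', '!']
```

So the match is not replaced with the same thing: the pattern and the replacement differ only in the surrounding spaces, and those spaces are what let a plain whitespace split produce one token per word, clitic, or punctuation mark.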