hermitdave / FrequencyWords

Repository for Frequency Word List Generator and processed files
MIT License

to CSV convertor #4

Closed: ziaenezhad closed this 2 years ago

ziaenezhad commented 7 years ago

Hi Hermit, thanks for sharing your great project. This piece of code converts every .txt file in the "./content" directory to a .csv file in the same path. It may help with importing the words into databases. It's written in Node.js. Usage: https://github.com/sajjad-shirazy/FrequencyWords/tree/master/src/to-csv-convertor

hugolpz commented 5 years ago

A bash command using sed, find, and basename could do the same without Node dependencies. This converts a single file to CSV:

sed -E 's/ /,/g' pl_50k.txt | sed '1 i\word,occurrences' > pl_50k.csv

I think the following would do for all files (I have some doubt about the ./content/2016/**/*.txt part; not tested):

for file in ./content/2016/**/*.txt; do echo "$file"; sed -E 's/ /,/g' "$file" | sed '1 i\word,occurrences' > "./content/2016/csv/$(basename "$file" .txt).csv"; done

Note: the current sed regex may break entries if the words themselves contain spaces, because it replaces every space with a comma.
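
One way around the space problem noted above is to replace only the last space on each line. The following is a minimal, untested-against-the-full-corpus sketch, assuming every line ends with a space followed by an integer count. The function names to_csv and convert_tree are hypothetical and not part of the repository. It also uses find instead of the ** glob, and writes each .csv next to its .txt source.

```shell
#!/usr/bin/env bash
# Sketch only: to_csv and convert_tree are hypothetical helper names,
# not part of the FrequencyWords repository.

# Print one "word count" file as CSV on stdout.
# Only the final space before the trailing number becomes a comma,
# so an entry that itself contains spaces is left intact.
# Assumes each line ends with " <integer>".
to_csv() {
  printf 'word,occurrences\n'
  sed -E 's/ ([0-9]+)$/,\1/' "$1"
}

# Convert every .txt file below the given directory,
# writing each .csv next to its source file.
convert_tree() {
  find "$1" -type f -name '*.txt' -print0 |
    while IFS= read -r -d '' file; do
      to_csv "$file" > "${file%.txt}.csv"
    done
}
```

Usage would be something like convert_tree ./content/2016 after sourcing the script. Entries that contain commas or double quotes would still need proper CSV quoting, which this sketch does not attempt.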

DGrothe-PhD commented 4 years ago

As Windows does not ship with "sed", here is an alternative for Windows: a VBS script that does the replacement in place of sed. https://stackoverflow.com/questions/127318/is-there-any-sed-like-utility-for-cmd-exe#6028937

cscript replace.vbs "pl_50k.txt" " " ","

To insert the header, just add one extra line to the VBS script:

(...)
Set objFile = objFSO.OpenTextFile(strFileName, ForWriting)
objFile.Write "Word,Occurrences" + vbNewLine ' (inserted)
objFile.Write strNewText
objFile.Close