strohne / Facepager

Facepager was made for fetching publicly available data from YouTube, Twitter and other websites on the basis of APIs and web scraping.
https://github.com/strohne/Facepager/releases

CSV structure occasionally broken #40

Closed nikicc closed 8 years ago

nikicc commented 8 years ago

Some FB posts cause the CSV export to break. By "break" I mean that posts no longer each sit on their own line, because some posts are split over multiple lines. For example, this is a screenshot of a post online (coming from here, scroll down to 23rd May 2014):

[screenshot: screen shot 2016-06-01 at 21 22 06]

The resulting CSV is broken into multiple lines like this:

"594";"4";"1";"98358327274_10152537848242275";"data";"fetched (200)";"2016-06-01 21:07:13.278147";"Facebook:<Object ID>/posts";"PIRATSKA STRANKA SLOVENIJE";"Posnetek včerajšnjega TV soočenja na RTV1 na katerem smo sodelovali tudi mi in na katerem je bil naš kandidat, po mnenju mnogih, nesporni zmagovalec.

Prepričajte se sami. Priporočamo tudi ogled včerajšnje aktivnosti na twitterju pod hashtagom #EUVolitve
https://twitter.com/search?q=%23euvolitve
";"51";"6";"13";"2014-05-23T07:34:08+0000"

This prevents Excel from importing the data correctly. Maybe we should escape newlines in messages when exporting to CSV?
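A minimal sketch of what flattening newlines at export time could look like, assuming the export writes semicolon-separated, fully quoted fields as in the sample above. This is not Facepager's actual export code; `flatten_newlines` and `write_rows` are hypothetical names used only for illustration.

```python
import csv
import io


def flatten_newlines(value, replacement=" "):
    """Replace every line break in a cell with `replacement` (hypothetical helper)."""
    text = "" if value is None else str(value)
    # Cover Windows (\r\n), old Mac (\r) and Unix (\n) line endings.
    return (
        text.replace("\r\n", replacement)
        .replace("\r", replacement)
        .replace("\n", replacement)
    )


def write_rows(rows, out):
    """Write rows as semicolon-separated, fully quoted CSV, one record per line."""
    writer = csv.writer(
        out, delimiter=";", quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    for row in rows:
        writer.writerow([flatten_newlines(cell) for cell in row])


# Example: a message containing blank lines and a trailing newline.
buffer = io.StringIO()
write_rows([["594", "Line one.\n\nLine two.\n", "51"]], buffer)
csv_text = buffer.getvalue()
```

With this approach every record ends up on exactly one physical line, at the cost of losing the original paragraph breaks in the message text.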

strohne commented 8 years ago

interesting issue. i have no problem with line breaks when opening the file with a double click. the line breaks do indeed break the import when opening the file from inside excel.

in a future version we should have an export option to delete line breaks.

workaround until then: a) open the file with a double click or b) open it in a text editor and replace all line breaks not preceded by a quotation mark
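Workaround b) can also be scripted. The following is a rough sketch, assuming every record ends with a closing quotation mark directly before the line break, as in the sample above; `join_broken_lines` is a hypothetical helper name. It would misfire on a message whose text itself has a quotation mark right before a line break, so treat it as a quick fix rather than a robust parser.

```python
import re

# A line break whose preceding character is not a double quote,
# i.e. a break inside a field rather than at the end of a record.
_BREAK_INSIDE_FIELD = re.compile(r'(?<!")\n')


def join_broken_lines(text, replacement=" "):
    """Replace line breaks that are not preceded by a quotation mark."""
    # Normalize Windows line endings first so the lookbehind sees the
    # quote character, not a carriage return.
    normalized = text.replace("\r\n", "\n")
    return _BREAK_INSIDE_FIELD.sub(replacement, normalized)


# Example: the first record is split across several lines.
broken = (
    '"1";"first line\n\nsecond line\nhttps://example.org\n";"51"\n'
    '"2";"ok";"6"\n'
)
fixed = join_broken_lines(broken)
```

After the replacement, `fixed` contains one line per record, which is the form that imports cleanly from inside Excel.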

nikicc commented 8 years ago

No worries, I already fixed it in Python. I just wanted to give a heads-up about the issue.

Also, thanks for the prompt reply, and it's great to hear this will be fixed in a future version!

strohne commented 8 years ago

thanks for commenting. maybe you could create a pull request?

nikicc commented 8 years ago

I can try if I find some time to set up the environment 😄