digitalmethodsinitiative / dmi-tcat

Digital Methods Initiative - Twitter Capture and Analysis Toolset
Apache License 2.0
367 stars 114 forks

Analysis: exclude [RT] - File with \[RT_\] cannot be downloaded #27

Closed. natoinet closed this issue 10 years ago

natoinet commented 10 years ago

Hello Erik!

We're having a lot of fun with TCat in Barcelona :-)

For big datasets, we wanted to exclude the retweets, so we entered [RT ] in the exclude field. The file is generated correctly in the analysis/cache folder on the server. However, we cannot download it from a browser because its name contains a slash before both the opening and the closing bracket of "[RT_]".

The generated URL is /analysis/cache/Barcelona-20140602-20140603--/[RT_/]----tweetStats--26060f3a66.csv and clicking the download link returns a 404 error.

I attached a screenshot of what was entered in the analysis UI for a Tweet stats analysis. (Screenshot: screen shot 2014-06-04 at 10 53 51)

Best!

ErikBorra commented 10 years ago

Hi there,

good to hear you're enjoying DMI-TCAT :) The error has to do with the / in the URL. Until I have time to do a decent fix, you can replace all / in the URL by %22.
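
For reference, here is a minimal sketch of how special characters in a file name can be percent-encoded before they go into a download link. It uses Python's standard `urllib.parse` rather than DMI-TCAT's own PHP code, and the helper name `encode_segment` and the sample name are purely illustrative. Whether the web server accepts encoded slashes inside a path depends on its configuration, which this thread does not confirm.

```python
from urllib.parse import quote, unquote


def encode_segment(name: str) -> str:
    """Percent-encode one path segment (hypothetical helper, not TCAT code).

    safe="" makes quote() encode "/" as well, which it leaves alone by default.
    """
    return quote(name, safe="")


# Percent-encodings of the characters discussed in this issue.
for ch in ["/", "\\", '"', "[", "]"]:
    print(repr(ch), "->", encode_segment(ch))

# Illustrative fragment of the generated file name.
fragment = "/[RT_/]"
encoded = encode_segment(fragment)
print(encoded)

# Decoding restores the original fragment.
assert unquote(encoded) == fragment
```

Only the file-name part should be encoded this way. The slashes that separate real directories such as /analysis/cache/ have to stay literal.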

Note that [RT ] will also exclude tweets containing a word that ends in 'rt' followed by a space, e.g. 'start this'. You can avoid this by excluding RT @ instead.
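
The difference between the two exclude terms can be sketched as follows. This assumes the exclude filter is a case-insensitive substring match, which the 'start this' example implies but which is not spelled out in this thread. The function `is_excluded` and the sample tweets are made up for illustration and are not DMI-TCAT code.

```python
def is_excluded(text: str, term: str) -> bool:
    """Assumed behaviour: case-insensitive substring match on the tweet text."""
    return term.lower() in text.lower()


# Made-up sample tweets.
tweets = [
    "RT @user: look at this",   # an actual retweet
    "start this now",           # contains "rt " inside "start "
    "Great art today",          # contains "rt " inside "art "
]

for term in ["RT ", "RT @"]:
    kept = [t for t in tweets if not is_excluded(t, term)]
    print(repr(term), "keeps", kept)
```

Under this assumption, excluding "RT " drops all three sample tweets, while excluding "RT @" drops only the retweet.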

Best,

Erik

ErikBorra commented 10 years ago

We have resolved this issue in commit https://github.com/digitalmethodsinitiative/dmi-tcat/commit/b811ad9c35e93423c40f054dad905b6c0b1d4fcc

Also check out our other new features!