Closed 0rC0 closed 5 years ago
I'm going to address the issue by writing a script that generates a CSV with the same characteristics as the dataset you used, and by reverting the format to the original one.
Could you please describe the exact format (column names and types) associated with the dataset you used?
Also, sorry for messing up :/
The code in the notebook refers to the file in the first commit. Maybe @forloopkilla (the author of the first commit) has it?
In any case, over the next few days I can try to adapt the code in the notebook to the format of the published dataset.
Also, maybe @vecna can help clarify what the format of the dataset was? @0rC0 I'm afraid the format of the published dataset would need some modification to work (for example, if the texts need to be tokenized, they can't be a list, and that's why they're concatenated).
In commit 4e14481, the pandas.DataFrame columns referred to in the code of FB_topics.ipynb are no longer present; they were probably in the not publicly available electiondata_topics.csv referred to in commit ab179b1: https://github.com/tracking-exposed/dashboard/blob/ab179b15756fecddfcb174182ede1104ce2d7794/FB_topics.ipynb#L116
For example, in https://github.com/tracking-exposed/dashboard/blob/4e1448115bdaa1be6611fce5b9e3c850fe5eb267/FB_topics.ipynb#L119 (probably from the eu19 dataset) there is no column concatenatedText: https://github.com/tracking-exposed/dashboard/blob/4e1448115bdaa1be6611fce5b9e3c850fe5eb267/FB_topics.ipynb#L144
There are also other issues, e.g. the language in user_a.csv is Spanish and not English: https://github.com/tracking-exposed/dashboard/blob/4e1448115bdaa1be6611fce5b9e3c850fe5eb267/FB_topics.ipynb#L78
Hey guys, sorry for the late reply. The data I have subsetted is all in English. I can upload the data 'codes/fb/electiondata_topics.csv' if you guys find that useful. Due to some privacy issues, is it best to email you the data or upload it here?
@forloopkilla I think you could just paste the column names here and I will figure out how to convert the current CSV outputs to one that is readable by your script :)
Correct, concatenatedText was something unique made for the datathon. Normally the text is not concatenated but given as a list (see https://eu19.tracking.exposed/page/api/ and look at texts returned from summary).
@forloopkilla
Hey, no! You can't share that dataset (you should have deleted it after the datathon). The solution is to align the code to the actual format.
COLUMN NAMES: ['ANGRY', 'HAHA', 'LIKE', 'LOVE', 'SAD', 'WOW', 'displaySource', 'fblinktype', 'id', 'images.count', 'impressionOrder', 'impressionTime', 'nature', 'permaLink', 'postId', 'publicationTime', 'source', 'sourceLink', 'timeline', 'user', 'concatenatedText', 'concatLanguage']
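To see how far a given export is from that old format, a small check like the one below might help. It is only a sketch: `EXPECTED_COLUMNS` is copied from the list above, while `missing_columns` and the example header are hypothetical names and values made up for illustration, not taken from the notebook or the eu19 export.

```python
# Column names of the datathon dataset, copied from the list above.
EXPECTED_COLUMNS = [
    'ANGRY', 'HAHA', 'LIKE', 'LOVE', 'SAD', 'WOW', 'displaySource',
    'fblinktype', 'id', 'images.count', 'impressionOrder', 'impressionTime',
    'nature', 'permaLink', 'postId', 'publicationTime', 'source',
    'sourceLink', 'timeline', 'user', 'concatenatedText', 'concatLanguage',
]


def missing_columns(header, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from `header`.

    `header` is any iterable of column names, for example the first row
    of a CSV file or `list(df.columns)` for a pandas DataFrame.
    The order of `expected` is preserved in the result.
    """
    present = set(header)
    return [name for name in expected if name not in present]


# Hypothetical header, only to show the call; a real export will differ.
example_header = ['id', 'postId', 'source', 'texts', 'LIKE']
print(missing_columns(example_header))
```

Whatever the function returns is the set of columns that would have to be rebuilt or dropped from the notebook code.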
Is this concatenatedText something like ' '.join(texts), or am I missing something?
I wanted to get my hands on the code :P and I'm trying to play with the DataFrame columns to rebuild the columns in the old format. In case anyone is interested: https://github.com/0rC0/dashboard/commit/a66ed302566c09b40fe8c76705324cec64bf169f
@0rC0 yes, that's it.
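For anyone landing here later, the conversion confirmed above could look roughly like this. It is a minimal sketch and not the code from the linked commit: it assumes each row is a dict whose `texts` value is a list of plain strings (the real API may return a different shape), and `add_concatenated_text` is a hypothetical helper name.

```python
def add_concatenated_text(rows, source_key="texts", target_key="concatenatedText"):
    """Return copies of `rows` with the list under `source_key` joined by spaces.

    Assumption: every element of the list is a plain string. Rows where the
    list is missing or empty get an empty string in `target_key`.
    """
    converted = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are left unchanged
        texts = row.get(source_key) or []
        # Skip empty strings and None so the result has no doubled spaces.
        new_row[target_key] = " ".join(t for t in texts if t)
        converted.append(new_row)
    return converted


# Made-up rows, only to show the call.
sample = [
    {"id": 1, "texts": ["Hello", "world"]},
    {"id": 2, "texts": []},
]
for row in add_concatenated_text(sample):
    print(row["id"], repr(row["concatenatedText"]))
```

With a pandas DataFrame the same idea should be a one-liner, `df['concatenatedText'] = df['texts'].apply(' '.join)`, again assuming every cell of `texts` really holds a list of strings.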
merged! thanks!