abjer / sds2019

Social Data Science 2019 - a summer school course
https://abjer.github.io/sds2019

Working With Lists In DataFrames #36

Open IAmAndreasSK opened 4 years ago

IAmAndreasSK commented 4 years ago

Hi,

My group and I use the hashtags #MakeAmericaGreatAgain and #ImWithHer as the basis for our project. However, we would also like to see which other hashtags the tweets use. To that end, we wrote code that stores a list of all hashtags used by each tweet in an "All Hashtags" column.

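# Collect the hashtag texts of each tweet into one list per tweet
hu = []
for tweet in data["results"]:
    ho = [str(d["text"]) for d in tweet["entities"]["hashtags"]]
    hu.append(ho)

# Assign the lists as a column, one list per row. Writing into
# df["Hashtag"].copy() would only change the copy, not df.
# Assumes df has one row per entry in data["results"], in the same order.
df["Hashtag"] = hu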

We were wondering the following:

  1. How can we easily count the number of times the different hashtags have been used? We tried df["Hashtag"].value_counts() but that counts the number of times specific lists occur rather than the elements in them. I guess we could do a loop but I'd hope for a more elegant solution.

  2. Is there a way to write the general code in a more 'smooth' way? And should we even use lists in the way we have done?

Thank you!

jrkkfst commented 4 years ago

I personally would use a dictionary. See this: https://stackoverflow.com/questions/3496518/python-using-a-dictionary-to-count-the-items-in-a-list

Or, as posted in the link, use a Counter: https://docs.python.org/2/library/collections.html#collections.Counter
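
A minimal sketch of the Counter approach, in case it helps. The `hashtag_lists` sample below is made up for illustration and stands in for the list of per-tweet hashtag lists built in the question (`hu`, or `df["Hashtag"].tolist()`); the hashtag values are assumptions, not real data.

```python
from collections import Counter
from itertools import chain

# Hypothetical sample: one list of hashtag strings per tweet
hashtag_lists = [
    ["MAGA", "ImWithHer"],
    ["MAGA"],
    [],  # a tweet without hashtags
    ["ImWithHer", "Election2016", "MAGA"],
]

# Flatten the list of lists and count each individual hashtag
hashtag_counts = Counter(chain.from_iterable(hashtag_lists))

# The two most frequent hashtags with their counts
top = hashtag_counts.most_common(2)
print(top)  # [('MAGA', 3), ('ImWithHer', 2)]
```

If you would rather stay in pandas and your version is 0.25 or newer, `df["Hashtag"].explode().value_counts()` should give the same per-hashtag counts. Hashtags that differ only in capitalisation are counted separately in both cases, so you may want to lowercase them first.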