loklak / loklak_server

Distributed Open Source twitter and social media message search server that anonymously collects, shares, dumps and indexes data http://api.loklak.org
GNU Lesser General Public License v2.1

Discrepancy in 'text' and 'mentions' data returned by api.loklak for @user #1602

Open simsausaurabh opened 6 years ago

simsausaurabh commented 6 years ago

Short description

There is a discrepancy in text and mentions data returned by api.loklak.

query: @wansapanahannah

returned result:
text : 31. i love @MarieliciousVIP since day one. @MarieliciousVIP
mentions : ["MarieliciousVIP", "MarieliciousVIP"]
images : []
images_count : 0
mentions_count : 2

actual tweet contains:
text : 31. i love @MarieliciousVIP since day one.
mentions : @MarieliciousVIP (only one mention)
It contains one image and one mention.

Link to tweet: https://twitter.com/wansapanahannah/status/998446061982175233

Output log for this particular status:

    {
      "provider_type": "SCRAPED",
      "audio_count": 0,
      "location_source": "ANNOTATION",
      "hashtags": [],
      "hashtags_count": 0,
      "favourites_count": 0,
      "link": "https://twitter.com/wansapanahannah/status/998446061982175233",
      "created_at": "2018-05-21T06:11:06.000Z",
      "videos": [],
      "mentions_count": 2,
      "classifier_emotion_probability": 2.7371401E-8,
      "without_lu_len": 59,
      "text_length": 59,
      "retweet_count": 0,
      "unshorten": {},
      "without_l_len": 59,
      "screen_name": "wansapanahannah",
      "id_str": "998446061982175233",
      "location_point": [
        -75.14675154124701,
        9.243909748850342
      ],
      "links_count": 0,
      "links": [],
      "videos_count": 0,
      "text": "31. i love @MarieliciousVIP since day one. @MarieliciousVIP",
      "audio": [],
      "place_id": "",
      "timestamp": "2018-05-21T06:16:10.897Z",
      "classifier_language_probability": 4.3254377E-7,
      "timestamp_id": 1526883368775,
      "place_name": "Sincé",
      "images": [],
      "without_luh_len": 59,
      "classifier_language": "english",
      "hosts": [],
      "place_country": "Colombia",
      "images_count": 0,
      "source_type": "TWITTER",
      "hosts_count": 0,
      "place_country_code": "CO",
      "place_country_center": [
        -40.850280817062725,
        8.790974835302208
      ],
      "place_context": "ABOUT",
      "location_mark": [
        -75.14575075920571,
        9.242148368511025
      ],
      "classifier_emotion": "joy",
      "mentions": [
        "MarieliciousVIP",
        "MarieliciousVIP"
      ],
      "user": {
        "appearance_first": "2018-05-21T06:16:08.784Z",
        "profile_image_url_https": "https://pbs.twimg.com/profile_images/998411636137328641/GnCw3pbH_bigger.jpg",
        "screen_name": "wansapanahannah",
        "user_id": "801041259519295489",
        "name": "madammm.💫",
        "appearance_latest": "2018-05-21T06:16:08.784Z"
      },
      "location_radius": 0
    }

Environment

Steps to reproduce

  1. Search for @wansapanahannah on api.loklak and find the result for the tweet status linked above.

Expected behaviour

It should return the exact number of mentions and the original text of the status.

Actual behaviour

Currently the returned text and mentions do not match the original status. This is the main problem related to issue #673.
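
To make the mismatch concrete, here is a minimal Python sketch that counts mentions in the original tweet text and in the text returned by the API. It is only an illustration, not code from loklak_server: the regular expression and the name `extract_mentions` are assumptions made for this example.

```python
import re

# Assumption: a mention is '@' followed by word characters, and is not
# preceded by a word character (so e-mail addresses are skipped).
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")


def extract_mentions(text):
    """Return the screen names mentioned in text, in order, duplicates kept."""
    return MENTION_RE.findall(text)


# Text of the tweet as it appears on Twitter (from the report above).
actual_text = "31. i love @MarieliciousVIP since day one."
# Text returned by api.loklak for the same status.
returned_text = "31. i love @MarieliciousVIP since day one. @MarieliciousVIP"

print(len(extract_mentions(actual_text)))    # 1
print(len(extract_mentions(returned_text)))  # 2
```

The returned text carries one extra trailing mention, which is why `mentions_count` is 2 instead of 1.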

simsausaurabh commented 6 years ago

@singhpratyush @sudheesh001 please review this.

singhpratyush commented 6 years ago

If the issue is on the server, you should open it there. I think it should be quite easy to fix.

simsausaurabh commented 6 years ago

Yes, this issue is on the server side only, which is why I created it here (on the server repo). I am looking into a fix.

singhpratyush commented 6 years ago

Oh! I'm sorry. I confused this with the loklak search repo. Please continue with the issue and post in the public chat if you run into any problems. Thanks.

simsausaurabh commented 6 years ago

@singhpratyush I am working on it :+1:

sudheesh001 commented 6 years ago

Is this still reproducible for you? I am unable to see such behavior on my local instance.

simsausaurabh commented 6 years ago

@sudheesh001 Yes, I am able to reproduce it. Please see the screenshot below.
query: from:simsausaurabh
Actual source status: https://twitter.com/simsausaurabh/status/1000331996348997632

screenshot from 2018-06-01 11-30-29

Result:

{
      "provider_type": "SCRAPED",
      "audio_count": 0,
      "classifier_profanity_probability": 2.932073E-33,
      "hashtags": [],
      "hashtags_count": 0,
      "favourites_count": 15,
      "link": "https://twitter.com/simsausaurabh/status/1000331996348997632",
      "created_at": "2018-05-26T11:05:07.000Z",
      "videos": [],
      "mentions_count": 19,
      "classifier_emotion_probability": 4.0653516E-29,
      "without_lu_len": 331,
      "text_length": 331,
      "retweet_count": 3,
      "unshorten": {},
      "without_l_len": 331,
      "screen_name": "simsausaurabh",
      "id_str": "1000331996348997632",
      "links_count": 0,
      "links": [],
      "videos_count": 0,
      "text": "Great meetup organised today! Thank you everyone for coming. I gave a brief introduction about Open Source and @fossasia, and how to start contributing for @gsoc with @fossasia. @mariobehling @hpdang @0rb1t3r @mielamvn @loklak_ @faevent @eventyay @fossasia @gsoc @mariobehling @hpdang @0rb1t3r @mielamvn @loklak_ @faevent @eventyay",
      "audio": [],
      "place_id": "",
      "timestamp": "2018-06-01T06:03:18.985Z",
      "classifier_language_probability": 1.1823488E-27,
      "timestamp_id": 1527832998890,
      "place_name": "",
      "images": [
        "https://pbs.twimg.com/media/DeHjpqXXcAAoZHq.jpg",
        "https://pbs.twimg.com/media/DeHjfAXWkAEYjML.jpg",
        "https://pbs.twimg.com/media/DeHj205WAAEzif4.jpg",
        "https://pbs.twimg.com/media/DeHj-0aW4AEQ251.jpg"
      ],
      "without_luh_len": 331,
      "classifier_language": "english",
      "hosts": [],
      "images_count": 4,
      "source_type": "TWITTER",
      "hosts_count": 0,
      "place_context": "ABOUT",
      "classifier_emotion": "joy",
      "classifier_profanity": "swear",
      "mentions": [
        "fossasia",
        "gsoc",
        "fossasia",
        "mariobehling",
        "hpdang",
        "0rb1t3r",
        "mielamvn",
        "loklak_",
        "faevent",
        "eventyay",
        "fossasia",
        "gsoc",
        "mariobehling",
        "hpdang",
        "0rb1t3r",
        "mielamvn",
        "loklak_",
        "faevent",
        "eventyay"
      ],
      "user": {
        "appearance_first": "2018-06-01T06:03:18.892Z",
        "profile_image_url_https": "https://pbs.twimg.com/profile_images/899700314369957888/Md7JSIr6_bigger.jpg",
        "screen_name": "simsausaurabh",
        "user_id": "713763581964251136",
        "name": "Saurabh Srivastava",
        "appearance_latest": "2018-06-01T06:03:18.892Z"
      }
    }
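
In the result above, the last nine mentions in "text" are exactly the de-duplicated mentions of the text that precedes them (10 mentions in the body plus 9 trailing ones gives the reported 19). The Python sketch below checks for that pattern. It is a heuristic written for this thread, not code from loklak_server; the function name and regular expressions are assumptions, and a genuine tweet that ends with the same run of mentions would be flagged too.

```python
import re

# Assumption: a mention is '@' plus word characters, not preceded by a word character.
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")
# A whitespace-separated token that is nothing but a single mention.
PURE_MENTION_RE = re.compile(r"@\w+")


def split_appended_mentions(text):
    """Return (body, appended), where appended is a trailing run of mentions
    equal to the ordered, de-duplicated mentions of body.
    Returns (text, []) when no such trailing run exists."""
    tokens = text.split()
    for k in range(1, len(tokens)):
        tail = tokens[-k:]
        # tail[0] is the token added in this iteration; stop growing the
        # tail as soon as it is not a bare mention.
        if not PURE_MENTION_RE.fullmatch(tail[0]):
            break
        body = " ".join(tokens[:-k])
        # dict.fromkeys keeps first-occurrence order while dropping duplicates.
        unique = list(dict.fromkeys(MENTION_RE.findall(body)))
        if unique == [t[1:] for t in tail]:
            return body, unique
    return text, []


body, appended = split_appended_mentions(
    "31. i love @MarieliciousVIP since day one. @MarieliciousVIP"
)
print(body)      # 31. i love @MarieliciousVIP since day one.
print(appended)  # ['MarieliciousVIP']
```

A check like this only helps confirm the symptom on stored results; the actual fix would still have to happen where the server builds the text.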

rmartinus commented 5 years ago

I would like to work on the fix if this is still an issue.

simsausaurabh commented 5 years ago

@rmartinus Yes, you can work on it; it is still an issue.