vespa-engine / vespa

AI + Data, online. https://vespa.ai
Apache License 2.0

`vespa` tutorial: ./src/python/user_search.py U33527 10 fails with KeyError: 'children' #20503

Closed: raphael10-collab closed this issue 2 years ago

raphael10-collab commented 2 years ago

I'm following the Vespa news recommendation tutorial step by step: https://docs.vespa.ai/en/tutorials/news-5-recommendation.html

(vespa) raphy@pc:~/vespa/sample-apps/news$ python3 src/python/train_cold_start.py mind 10
Reading data from mind/train/behaviors.tsv
Reading data from mind/dev/behaviors.tsv
Reading data from mind/train/news.tsv
Reading data from mind/dev/news.tsv
Reading data from mind/train/news_embeddings.tsv
Reading data from mind/dev/news_embeddings.tsv
Total loss after epoch 1: 920.5850219726562 (0.7038111686706543 avg)
{'auc': 0.5391, 'mrr': 0.2367, 'ndcg@5': 0.2464, 'ndcg@10': 0.306}
{'auc': 0.5131, 'mrr': 0.2239, 'ndcg@5': 0.2296, 'ndcg@10': 0.2933}
Total loss after epoch 2: 761.771728515625 (0.5823943018913269 avg)
{'auc': 0.6469, 'mrr': 0.2991, 'ndcg@5': 0.3244, 'ndcg@10': 0.3828}
{'auc': 0.5656, 'mrr': 0.2448, 'ndcg@5': 0.2605, 'ndcg@10': 0.3257}
Total loss after epoch 3: 660.232421875 (0.5047648549079895 avg)
{'auc': 0.7031, 'mrr': 0.3325, 'ndcg@5': 0.3665, 'ndcg@10': 0.4262}
{'auc': 0.5927, 'mrr': 0.2623, 'ndcg@5': 0.2847, 'ndcg@10': 0.3472}
Total loss after epoch 4: 625.5690307617188 (0.478263795375824 avg)
{'auc': 0.7329, 'mrr': 0.3519, 'ndcg@5': 0.3901, 'ndcg@10': 0.4514}
{'auc': 0.5999, 'mrr': 0.2639, 'ndcg@5': 0.2878, 'ndcg@10': 0.3496}
Total loss after epoch 5: 605.7523803710938 (0.4631134271621704 avg)
{'auc': 0.7603, 'mrr': 0.3759, 'ndcg@5': 0.4191, 'ndcg@10': 0.4798}
{'auc': 0.6097, 'mrr': 0.2716, 'ndcg@5': 0.2966, 'ndcg@10': 0.3583}
Total loss after epoch 6: 588.047119140625 (0.44957730174064636 avg)
{'auc': 0.788, 'mrr': 0.3993, 'ndcg@5': 0.4493, 'ndcg@10': 0.5093}
{'auc': 0.6154, 'mrr': 0.2742, 'ndcg@5': 0.3005, 'ndcg@10': 0.3633}
Total loss after epoch 7: 570.1577758789062 (0.4359004497528076 avg)
{'auc': 0.8141, 'mrr': 0.4268, 'ndcg@5': 0.4835, 'ndcg@10': 0.5407}
{'auc': 0.6203, 'mrr': 0.2777, 'ndcg@5': 0.3045, 'ndcg@10': 0.3678}
Total loss after epoch 8: 551.16064453125 (0.42137664556503296 avg)
{'auc': 0.8381, 'mrr': 0.4548, 'ndcg@5': 0.5188, 'ndcg@10': 0.5737}
{'auc': 0.6225, 'mrr': 0.2802, 'ndcg@5': 0.3068, 'ndcg@10': 0.3704}
Total loss after epoch 9: 534.6995239257812 (0.4087916910648346 avg)
{'auc': 0.8578, 'mrr': 0.4789, 'ndcg@5': 0.5482, 'ndcg@10': 0.6013}
{'auc': 0.6265, 'mrr': 0.2846, 'ndcg@5': 0.3117, 'ndcg@10': 0.3747}
Total loss after epoch 10: 517.1571044921875 (0.39538004994392395 avg)
{'auc': 0.8758, 'mrr': 0.5073, 'ndcg@5': 0.5817, 'ndcg@10': 0.6315}
{'auc': 0.6246, 'mrr': 0.2843, 'ndcg@5': 0.3113, 'ndcg@10': 0.3732}
(vespa) raphy@pc:~/vespa/sample-apps/news$ 

But the next step fails. The query for the user returns no documents, and user_search.py then raises a KeyError:

(vespa) raphy@pc:~/vespa/sample-apps/news$ python3 src/python/convert_embeddings_to_vespa_format.py mind
Reading embeddings data from mind/user_embeddings.tsv
Reading embeddings data from mind/news_embeddings.tsv
(vespa) raphy@pc:~/vespa/sample-apps/news$ curl -s -H "Content-Type: application/json" --data \
> '{"yql" : "select * from sources user where user_id contains \"U33527\";", "hits": 1}' \
> http://localhost:8080/search/ | python -m json.tool
{
    "root": {
        "id": "toplevel",
        "relevance": 1.0,
        "fields": {
            "totalCount": 0
        },
        "coverage": {
            "coverage": 100,
            "documents": 0,
            "full": true,
            "nodes": 1,
            "results": 1,
            "resultsFull": 1
        }
    }
}
(vespa) raphy@pc:~/vespa/sample-apps/news$ ./src/python/user_search.py U33527 10
Traceback (most recent call last):
  File "./src/python/user_search.py", line 58, in <module>
    main()
  File "./src/python/user_search.py", line 51, in main
    user_vector = query_user_embedding(user_id)
  File "./src/python/user_search.py", line 21, in query_user_embedding
    embedding = parse_embedding(result["root"]["children"][0])
KeyError: 'children'
(vespa) raphy@pc:~/vespa/sample-apps/news$ 
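
The KeyError itself only says that the response contained no hits: with totalCount 0, the JSON returned by the curl query above has no "children" key at all. A minimal sketch of a guard that turns this into a readable message follows. Note that first_hit is a hypothetical helper written for illustration, not part of the sample app's user_search.py.

```python
def first_hit(result: dict) -> dict:
    """Return the first hit of a Vespa search response, or raise a clear error.

    Hypothetical helper for illustration; not part of the sample app.
    """
    root = result.get("root", {})
    children = root.get("children")
    if not children:
        # Vespa leaves out "children" entirely when nothing matched.
        total = root.get("fields", {}).get("totalCount", 0)
        raise LookupError(
            f"query matched {total} documents; has the document been fed to Vespa?"
        )
    return children[0]


# The empty response from the curl query above, abbreviated.
empty = {"root": {"id": "toplevel", "relevance": 1.0, "fields": {"totalCount": 0}}}

try:
    first_hit(empty)
except LookupError as err:
    print(err)
```

With a guard like this, the script would point at the missing user document instead of failing on the dictionary lookup.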

(vespa) raphy@pc:~/vespa/sample-apps/news$ grep "U33527" mind/vespa_user_embeddings.json
{"put": "id:user:user::U33527", "fields": {"user_id":"U33527", "embedding": {"values": [0.000000,0.060903,0.158397,0.003585,0.230960,0.005171,-0.300856,-0.295116,-0.042150,-0.416067,-0.173345,-0.241960,-0.140207,-0.000399,0.463869,-0.294422,-0.080257,-0.208765,-0.070218,0.189583,0.031040,-0.073909,-0.147883,-0.164819,-0.229605,-0.248327,0.174647,-0.168265,-0.370106,-0.209611,-0.206252,-0.288447,0.091576,-0.122662,0.000394,0.172982,-0.147844,0.326629,-0.103831,-0.312612,-0.209032,0.190745,-0.335539,0.261593,0.699852,0.041234,0.241921,0.052331,0.103968,-0.216830,-0.279406]} }},

OS: Ubuntu 20.04

How can I solve this?

raphael10-collab commented 2 years ago

SOLVED (thanks to a hint given in Slack): the user embeddings in mind/vespa_user_embeddings.json had to be fed to Vespa first:

(vespa) raphy@pc:~/vespa/sample-apps/news$ java -jar vespa-http-client-jar-with-dependencies.jar --file mind/vespa_user_embeddings.json --endpoint http://localhost:8080
Tue Dec 14 11:27:19 CET 2021 Result received: 0 (0 failed so far, 5000 sent, success rate 0.00 docs/sec).
Tue Dec 14 11:27:21 CET 2021 Result received: 5000 (0 failed so far, 5000 sent, success rate 3362.47 docs/sec).
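
As an aside, a single "put" operation from the feed file can also be sent through Vespa's /document/v1 HTTP API instead of the Java client used above. The sketch below only shows how such an operation maps to a request. It is an alternative I did not use here, document_v1_request is a hypothetical helper name, and it handles "put" operations only (not the "update" operations in the news embeddings file).

```python
import json
from typing import Tuple
from urllib.parse import quote


def document_v1_request(operation: dict) -> Tuple[str, str, str]:
    """Map a feed-file "put" operation to (HTTP method, URL path, JSON body).

    Hypothetical helper for illustration; only "put" operations are handled.
    """
    doc_id = operation["put"]  # e.g. "id:user:user::U33527"
    # Document ids have the form id:<namespace>:<doctype>:<key/values>:<user id>.
    _, namespace, doctype, _, user_id = doc_id.split(":", 4)
    path = f"/document/v1/{namespace}/{doctype}/docid/{quote(user_id, safe='')}"
    body = json.dumps({"fields": operation["fields"]})
    return "POST", path, body


# Embedding values omitted here to keep the example short.
op = {"put": "id:user:user::U33527", "fields": {"user_id": "U33527"}}
method, path, body = document_v1_request(op)
print(method, path)
```

The body would then be sent to http://localhost:8080 plus the path, with Content-Type: application/json, for example with curl.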
(vespa) raphy@pc:~/vespa/sample-apps/news$ nano mind/vespa_news_embeddings.json
(vespa) raphy@pc:~/vespa/sample-apps/news$ head mind/vespa_news_embeddings.json
[
{"update": "id:news:news::N13390", "fields": {"embedding": {"assign": { "values": [4.537634,0.176455,0.065556,0.019093,0.752897,0.068659,0.503554,0.014679,0.362952,0.199735,0.127425,0.888862,0.517692,0.899216,0.371096,0.298077,0.612331,0.175028,0.060978,0.203315,0.347929,0.666393,0.139776,0.490055,0.922364,0.315164,0.077856,0.067023,0.011687,0.273995,0.989293,0.081177,0.445051,0.043624,0.024770,0.016435,0.051732,0.643033,0.972747,0.508166,0.949796,0.093464,0.229422,0.650621,0.469585,0.693245,0.031115,0.654180,0.167369,0.610483,0.633564]} }}},
{"update": "id:news:news::N7180", "fields": {"embedding": {"assign": { "values": [4.854028,0.418532,0.980599,0.024325,0.092835,0.130468,0.110127,0.049772,0.093576,0.637094,0.033213,0.048952,0.133599,0.837817,0.025891,0.049366,0.019674,0.122829,0.100188,0.962890,0.475165,0.026999,0.092472,0.137771,0.882851,0.090943,0.010619,0.045324,0.942113,0.034402,0.490993,0.055808,0.117594,0.476196,0.448801,0.816098,0.036912,0.287460,0.883396,0.054365,0.190186,0.973147,0.135480,0.051794,0.073309,0.309725,0.060026,0.039512,0.076702,0.048420,0.255657]} }}},
{"update": "id:news:news::N20785", "fields": {"embedding": {"assign": { "values": [5.073834,0.067972,0.014702,0.333254,0.815943,0.312187,0.197460,0.060545,0.207915,0.310328,0.013294,0.198636,0.687183,0.032743,0.130325,0.131107,0.177471,0.311574,0.066050,0.935336,0.029173,0.156829,0.111487,0.284573,0.616157,0.253709,0.895951,0.019022,0.004992,0.107877,0.628005,0.071800,0.091871,0.175017,0.021793,0.009136,0.212465,0.224601,0.979112,0.381634,0.094033,0.058408,0.147255,0.180715,0.168731,0.215194,0.723256,0.179237,0.009232,0.305230,0.053397]} }}},
{"update": "id:news:news::N6937", "fields": {"embedding": {"assign": { "values": [4.417276,0.301127,0.012974,0.463084,0.564266,0.458605,0.388361,0.185326,0.320597,0.096008,0.147782,0.873207,0.683620,0.567942,0.486916,0.132167,0.851581,0.214342,0.536453,0.149083,0.039581,0.886533,0.776935,0.193094,0.377836,0.462609,0.103072,0.018507,0.010982,0.503241,0.974111,0.825754,0.721805,0.188915,0.093204,0.005576,0.188283,0.410380,0.976232,0.043870,0.934919,0.039493,0.302831,0.233829,0.302716,0.606920,0.008691,0.785021,0.112792,0.917323,0.508822]} }}},
{"update": "id:news:news::N15776", "fields": {"embedding": {"assign": { "values": [4.475263,0.773574,0.959231,0.079187,0.287883,0.626035,0.197106,0.354341,0.615190,0.783243,0.086871,0.257295,0.211249,0.760732,0.081781,0.134749,0.036826,0.801256,0.284759,0.978185,0.193252,0.271834,0.372448,0.098909,0.368470,0.273710,0.064203,0.253648,0.954332,0.113236,0.619213,0.205366,0.314757,0.577215,0.702669,0.734704,0.099124,0.736872,0.829430,0.015429,0.327150,0.978808,0.408599,0.224962,0.092252,0.445781,0.072847,0.100358,0.430305,0.183941,0.512956]} }}},
{"update": "id:news:news::N25810", "fields": {"embedding": {"assign": { "values": [5.080203,0.140029,0.022924,0.028488,0.145840,0.076443,0.474826,0.254944,0.222124,0.198149,0.202117,0.176421,0.165975,0.241384,0.606952,0.521654,0.257760,0.167004,0.178586,0.022425,0.064699,0.164531,0.127658,0.645775,0.609871,0.343775,0.120801,0.041054,0.720222,0.160948,0.943020,0.267108,0.360078,0.385207,0.016551,0.736259,0.507397,0.167205,0.330736,0.959218,0.152976,0.042024,0.303824,0.149542,0.368999,0.041691,0.040809,0.138489,0.071952,0.278245,0.126749]} }}},
{"update": "id:news:news::N20820", "fields": {"embedding": {"assign": { "values": [4.464570,0.324769,0.010672,0.088937,0.322797,0.141691,0.862767,0.625515,0.298326,0.481876,0.593258,0.668515,0.674742,0.019196,0.716556,0.831163,0.769122,0.373668,0.395639,0.001488,0.173437,0.259196,0.437724,0.330261,0.971852,0.743486,0.455003,0.003886,0.952004,0.396814,0.346307,0.645468,0.317818,0.183288,0.038867,0.530320,0.787252,0.417601,0.042894,0.992955,0.085928,0.076659,0.708430,0.366424,0.424335,0.061064,0.002889,0.328999,0.275246,0.305060,0.428642]} }}},
{"update": "id:news:news::N6885", "fields": {"embedding": {"assign": { "values": [4.541712,0.576900,0.993217,0.550350,0.544055,0.258810,0.436886,0.369377,0.468617,0.401588,0.142671,0.297778,0.341094,0.304500,0.443516,0.414711,0.524972,0.331254,0.465918,0.966455,0.507246,0.490019,0.414285,0.078451,0.029325,0.361003,0.433629,0.316037,0.068774,0.328452,0.063977,0.343406,0.505104,0.315251,0.895866,0.960758,0.246228,0.421314,0.019856,0.028228,0.016134,0.742509,0.385864,0.341553,0.506442,0.669289,0.995667,0.171480,0.373874,0.324606,0.435659]} }}},
{"update": "id:news:news::N27294", "fields": {"embedding": {"assign": { "values": [4.999808,0.182948,0.008930,0.077872,0.331501,0.058606,0.851061,0.027666,0.351978,0.048799,0.365057,0.415393,0.305936,0.598847,0.617623,0.191926,0.179350,0.218271,0.109434,0.006790,0.021324,0.154166,0.031854,0.101775,0.824676,0.124108,0.078305,0.003849,0.103536,0.139316,0.602862,0.467959,0.576522,0.053938,0.010906,0.016532,0.659504,0.100304,0.466050,0.860380,0.993305,0.749481,0.082979,0.254614,0.040648,0.136725,0.070708,0.284837,0.147887,0.239353,0.072110]} }}},
(vespa) raphy@pc:~/vespa/sample-apps/news$ 
(vespa) raphy@pc:~/vespa/sample-apps/news$ ./src/python/user_search.py U33527 10
{
  "root": {
    "id": "toplevel",
    "relevance": 1.0,
    "fields": {
      "totalCount": 28603
    },
    "coverage": {
      "coverage": 100,
      "documents": 28603,
      "full": true,
      "nodes": 1,
      "results": 1,
      "resultsFull": 1
    },
    "children": [
      {
        "id": "id:news:news::N4798",
        "relevance": 0.3778493237648661,
        "source": "mind",
        "fields": {
          "sddocname": "news",
          "documentid": "id:news:news::N4798",
          "news_id": "N4798",
          "category": "foodanddrink",
          "subcategory": "newstrends",
          "title": "PCC Community Markets Plans Its First Fast Casual Restaurant Inside New Ballard Store",
          "abstract": "The massive PCC outpost will open November 13",
          "url": "https://www.msn.com/en-us/foodanddrink/newstrends/pcc-community-markets-plans-its-first-fast-casual-restaurant-inside-new-ballard-store/ar-AAJfTtm?ocid=chopendata",
          "date": 20191109,
          "clicks": 0,
          "impressions": 0
        }
      },
      {
        "id": "id:news:news::N15773",
        "relevance": 0.3778493237648661,
        "source": "mind",
        "fields": {
          "sddocname": "news",
          "documentid": "id:news:news::N15773",
          "news_id": "N15773",
          "category": "lifestyle",
          "subcategory": "lifestyledidyouknow",
          "title": "20 Words and Phrases You Had No Idea Were Coined in New York City",
          "abstract": "Hey, youz! Check out all these vocab gems that were born in New York City! The post 20 Words and Phrases You Had No Idea Were Coined in New York City appeared first on Reader's Digest.",
          "url": "https://www.msn.com/en-us/lifestyle/lifestyledidyouknow/20-words-and-phrases-you-had-no-idea-were-coined-in-new-york-city/ss-AAHYStZ?ocid=chopendata",
          "date": 20191110,
          "clicks": 0,
          "impressions": 0
        }
      },
      {
        "id": "id:news:news::N17651",
        "relevance": 0.3778493237648661,
        "source": "mind",
        "fields": {
          "sddocname": "news",
          "documentid": "id:news:news::N17651",
          "news_id": "N17651",
          "category": "news",
          "subcategory": "newscrime",
          "title": "25-year-old killed in Arlington shooting, police say",
          "abstract": "A 25-year-old Grand Prairie man died after he was shot in Arlington on Tuesday night, Arlington police tell WFAA. Anthony Tennon was found by officers lying in the parking lot of an apartment complex on the 2100 block of Hendricks Drive with a gunshot wound around 9:10 p.m., police say. First responders took him to a local hospital, where he was later pronounced dead, according to police. \"Detectives do not believe this was a random encounter...",
          "url": "https://www.msn.com/en-us/news/newscrime/25-year-old-killed-in-arlington-shooting,-police-say/ar-AAISHcw?ocid=chopendata",
          "date": 20191101,
          "clicks": 0,
          "impressions": 0
        }
      },
      {
        "id": "id:news:news::N3775",
        "relevance": 0.3778493237648661,
        "source": "mind",
        "fields": {
          "sddocname": "news",
          "documentid": "id:news:news::N3775",
          "news_id": "N3775",
          "category": "entertainment",
          "subcategory": "celebrity",
          "title": "40 Celebrity Mothers and Daughters at the Same Age",
          "abstract": "These are some good genes.",
          "url": "https://www.msn.com/en-us/entertainment/celebrity/40-celebrity-mothers-and-daughters-at-the-same-age/ss-AAAM2R4?ocid=chopendata",
          "date": 20191112,
          "clicks": 0,
          "impressions": 0
        }
      },
      {
        "id": "id:news:news::N28518",
        "relevance": 0.3778493237648661,
        "source": "mind",
        "fields": {
          "sddocname": "news",
          "documentid": "id:news:news::N28518",
          "news_id": "N28518",
          "category": "lifestyle",
          "subcategory": "lifestylebeauty",
          "title": "46 Mismatched Nail Ideas You'll Want to Copy Immediately",
          "abstract": "Why wear one shade when you can wear them all?",
          "url": "https://www.msn.com/en-us/lifestyle/lifestylebeauty/46-mismatched-nail-ideas-you'll-want-to-copy-immediately/ss-AACJcix?ocid=chopendata",
          "date": 20191105,
          "clicks": 0,
          "impressions": 1
        }
      },
      {
        "id": "id:news:news::N3112",
        "relevance": 0.3778493237648661,
        "source": "mind",
        "fields": {
          "sddocname": "news",
          "documentid": "id:news:news::N3112",
          "news_id": "N3112",
          "category": "lifestyle",
          "subcategory": "lifestyleroyals",
          "title": "The Brands Queen Elizabeth, Prince Charles, and Prince Philip Swear By",
          "abstract": "Shop the notebooks, jackets, and more that the royals can't live without.",
          "url": "https://www.msn.com/en-us/lifestyle/lifestyleroyals/the-brands-queen-elizabeth,-prince-charles,-and-prince-philip-swear-by/ss-AAGH0ET?ocid=chopendata",
          "date": 20191107,
          "clicks": 0,
          "impressions": 0
        }
      },
      {
        "id": "id:news:news::N17110",
        "relevance": 0.3778493237648661,
        "source": "mind",
        "fields": {
          "sddocname": "news",
          "documentid": "id:news:news::N17110",
          "news_id": "N17110",
          "category": "news",
          "subcategory": "newsopinion",
          "title": "The News In Cartoons",
          "abstract": "News as seen through the eyes of the nation's editorial cartoonists.",
          "url": "https://www.msn.com/en-us/news/newsopinion/the-news-in-cartoons/ss-AABGTFJ?ocid=chopendata",
          "date": 20191108,
          "clicks": 0,
          "impressions": 0
        }
      },
      {
        "id": "id:news:news::N5452",
        "relevance": 0.3778493237648661,
        "source": "mind",
        "fields": {
          "sddocname": "news",
          "documentid": "id:news:news::N5452",
          "news_id": "N5452",
          "category": "autos",
          "subcategory": "autostrucks",
          "title": "Why Is This Weird Chevy Colorado Test Mule Trolling Ford HQ?",
          "abstract": "It could be a Bronco test vehicle wearing a Bow Tie disguise, but some things don't add up.",
          "url": "https://www.msn.com/en-us/autos/autostrucks/why-is-this-weird-chevy-colorado-test-mule-trolling-ford-hq?/ar-AAEwL70?ocid=chopendata",
          "date": 20191102,
          "clicks": 0,
          "impressions": 0
        }
      },
      {
        "id": "id:news:news::N9244",
        "relevance": 0.3778493237648661,
        "source": "mind",
        "fields": {
          "sddocname": "news",
          "documentid": "id:news:news::N9244",
          "news_id": "N9244",
          "category": "weather",
          "subcategory": "weathertopstories",
          "title": "In Photos: Flooding across the globe",
          "abstract": "Millions of people have been displaced from their homes after torrential rains triggered floods and landslides across the globe.",
          "url": "https://www.msn.com/en-us/weather/topstories/in-photos-flooding-across-the-globe/ss-AAEmMJu?ocid=chopendata",
          "date": 20191107,
          "clicks": 0,
          "impressions": 1
        }
      },
      {
        "id": "id:news:news::N4162",
        "relevance": 0.3778493237648661,
        "source": "mind",
        "fields": {
          "sddocname": "news",
          "documentid": "id:news:news::N4162",
          "news_id": "N4162",
          "category": "movies",
          "subcategory": "movies-gallery",
          "title": "Must-see biopics",
          "abstract": "Take a look at the greatest true-life biographical stories on the big screen.",
          "url": "https://www.msn.com/en-us/movies/movies-gallery/must-see-biopics/ss-AABsqwH?ocid=chopendata",
          "date": 20191104,
          "clicks": 0,
          "impressions": 0
        }
      }
    ]
  }
}
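
To scan a long result like the one above more easily, the hits can be reduced to a few fields per line. This is a small sketch that assumes the response has been loaded into a Python dict. The name summarize_hits is hypothetical and not part of the sample app.

```python
def summarize_hits(result: dict) -> list:
    """Return (news_id, relevance, title) for each hit in a Vespa response.

    Hypothetical helper for illustration; not part of the sample app.
    """
    rows = []
    for hit in result.get("root", {}).get("children", []):
        fields = hit.get("fields", {})
        rows.append((fields.get("news_id"), hit.get("relevance"), fields.get("title")))
    return rows


# One hit copied from the output above, abbreviated.
response = {
    "root": {
        "fields": {"totalCount": 28603},
        "children": [
            {
                "id": "id:news:news::N4162",
                "relevance": 0.3778493237648661,
                "fields": {"news_id": "N4162", "title": "Must-see biopics"},
            }
        ],
    }
}

for news_id, relevance, title in summarize_hits(response):
    print(news_id, relevance, title)
```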