Beit-Hatfutsot / mojp-dbs-pipelines

Pipelines for syncing Jewish data sources to the DB of the Museum of the Jewish People
MIT License

[task] research how we should handle items with missing title in a language or no title at all #18

Open OriHoch opened 7 years ago

OriHoch commented 7 years ago

In the old code there is this test, but I guess it's not relevant once we have better control over the data.

We need to think about it and its implications.

    # the update_es function sets the header of an item without a header to "_",
    # so this is what an item with a missing Hebrew header looks like in ES
    assert PERSONALITY_WITH_MISSING_HE_HEADER_AND_SLUG["Header"] == {'En': 'Davydov, Karl Yulyevich', 'He': '_'}
    # these items will also not have a Hebrew slug
    assert PERSONALITY_WITH_MISSING_HE_HEADER_AND_SLUG["Slug"] == {'En': 'luminary_davydov-karl-yulyevich'}
    # search for these items
    result = list(assert_search_results(client.get(u"/v1/search?q=karl+yulyevich"), 1))[0]["_source"]
    assert result["Header"] == {'En': 'Davydov, Karl Yulyevich', 'He': '_',
                                "En_lc": 'Davydov, Karl Yulyevich'.lower(), "He_lc": "_"}
    assert result["Slug"] == {'En': 'luminary_davydov-karl-yulyevich'}
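For reference, the placeholder behavior that the test asserts can be sketched roughly as follows. This is only an illustration, not the actual `update_es` code: the function name `fill_missing_headers` and the constants are hypothetical, and the only things taken from the test are the `"_"` placeholder and the `En_lc` / `He_lc` lowercase fields.

```python
# Hypothetical sketch; the real update_es function is not shown in this issue.
PLACEHOLDER = "_"          # placeholder value seen in the old test
LANGUAGES = ("En", "He")   # languages that appear in the test's Header dict


def fill_missing_headers(header):
    """Return a new dict where missing or empty titles become the placeholder.

    Also adds the lowercase "<lang>_lc" variants that the search result contains.
    """
    filled = {}
    for lang in LANGUAGES:
        title = (header.get(lang) or "").strip()
        filled[lang] = title or PLACEHOLDER
        filled[lang + "_lc"] = filled[lang].lower()
    return filled


doc_header = fill_missing_headers({"En": "Davydov, Karl Yulyevich", "He": None})
```

With this sketch, a missing Hebrew title ends up as `"_"` in both `He` and `He_lc`, which matches what the old test expects to find in ES.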

nuritgazit commented 7 years ago

How many of them do we have? If the number is small enough for manual correction, I think we should hand a list over to Haim so his team can take care of it. As for future cases, it won't be possible to add a new item or edit an existing one without it having a title.
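To answer the "how many" question, something like the following could produce both the count and the list for manual correction. It is a minimal sketch under assumptions: items are plain dicts with a `Header` dict keyed by `En` / `He` as in the old test, the `id` field and the function names are hypothetical, and a `"_"` value is treated as a missing title.

```python
# Hypothetical sketch; the field names other than "Header", "En" and "He" are assumptions.
LANGUAGES = ("En", "He")
PLACEHOLDER = "_"  # old code stored this when a title was missing


def missing_title_languages(item):
    """Return the languages in which the item has no usable title."""
    header = item.get("Header") or {}
    missing = []
    for lang in LANGUAGES:
        title = (header.get(lang) or "").strip()
        if not title or title == PLACEHOLDER:
            missing.append(lang)
    return missing


def report_missing_titles(items):
    """Return (item id, missing languages) pairs for items that need correction."""
    report = []
    for item in items:
        missing = missing_title_languages(item)
        if missing:
            report.append((item.get("id"), missing))
    return report


sample_items = [
    {"id": 1, "Header": {"En": "Davydov, Karl Yulyevich", "He": "_"}},
    {"id": 2, "Header": {"En": "Some title", "He": "Some Hebrew title"}},
    {"id": 3, "Header": {}},
]
report = report_missing_titles(sample_items)
print(len(report), "items need a title fix")
```

An item that is missing every language (like the third sample item) shows up with all languages listed, so the "no title at all" case can be separated from the "missing in one language" case.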