Sandrine2016 / BGC-NASA-landslide-detection

Detect landslide location, date, and event on Reddit
MIT License

lid.176.bin has wrong file format #8

Open bgc-autumn opened 2 years ago

bgc-autumn commented 2 years ago

Executing main.py successfully queries Pushshift, but the run eventually fails with the error "ValueError: /src/../models/lid.176.bin has wrong file format!". This happens both in a local environment and in a locally built container. To reproduce, please build and run the container from a clean clone of the repo.

Run log with stack trace as follows:

INFO:root:done
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 46 concurrent workers.
[Parallel(n_jobs=-1)]: Done   8 out of  24 | elapsed:    2.7s remaining:    5.3s
[Parallel(n_jobs=-1)]: Done  24 out of  24 | elapsed:    4.8s finished
Warning : `load_model` does not return WordVectorModel or SupervisedModel any more, but a `FastText` object which is very similar.
Traceback (most recent call last):
  File "//src/main.py", line 33, in <module>
    main()
  File "//src/main.py", line 13, in main
    articles_df = data.get_articles_df(start_date, end_date)
  File "/src/data/data.py", line 28, in get_articles_df
    df = articles.filter_invalid_articles(df)
  File "/src/data/articles.py", line 143, in filter_invalid_articles
    df = filter_articles_by_lang(df)
  File "/src/data/articles.py", line 108, in filter_articles_by_lang
    lang_model = fasttext.load_model(os.path.join(config.model_path, "lid.176.bin"))
  File "/usr/local/lib/python3.9/site-packages/fasttext/FastText.py", line 441, in load_model
    return _FastText(model_path=path)
  File "/usr/local/lib/python3.9/site-packages/fasttext/FastText.py", line 98, in __init__
    self.f.loadModel(model_path)
ValueError: /src/../models/lid.176.bin has wrong file format!

LiamNiisan commented 2 years ago

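The problem seems to be caused by git-lfs not being installed; thank you for catching that. Without git-lfs, a clone most likely contains only a small Git LFS pointer file in place of the actual lid.176.bin model, which fastText cannot load. An installation section has been added to the README with more detailed instructions on how to run the code.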
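
For anyone hitting the same error, a quick way to confirm the diagnosis is to check whether models/lid.176.bin is a Git LFS pointer rather than the real model. The sketch below is not part of the repo: `is_lfs_pointer` and `check_model_file` are hypothetical helper names, and the only fact relied on is that Git LFS pointer files start with the line `version https://git-lfs.github.com/spec/v1`.

```python
from pathlib import Path

# First line of every Git LFS pointer file, per the Git LFS pointer spec.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` starts like a Git LFS pointer file."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def check_model_file(path):
    """Hypothetical guard: fail with a clearer message before loading the model."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"{path} does not exist")
    if is_lfs_pointer(path):
        raise RuntimeError(
            f"{path} is a Git LFS pointer, not the model itself; "
            "install git-lfs and run `git lfs pull` in the repository"
        )
    return path
```

Calling `check_model_file(os.path.join(config.model_path, "lid.176.bin"))` just before `fasttext.load_model(...)` would, under these assumptions, turn the "wrong file format" ValueError into a message that points at the missing git-lfs setup.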