brunoamaral / gregory-ai

Artificial Intelligence and Machine Learning to help find scientific research and filter relevant content
https://gregory-ai.com/

Admin container can't run training for the Machine Learning models #108

Open brunoamaral opened 2 years ago

brunoamaral commented 2 years ago

1_data_processor.py:

>>> dataset["summary"] = dataset["summary"].apply(html.unescape)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python3.10/site-packages/pandas/core/series.py", line 4433, in apply
    return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
  File "/usr/local/lib/python3.10/site-packages/pandas/core/apply.py", line 1082, in apply
    return self.apply_standard()
  File "/usr/local/lib/python3.10/site-packages/pandas/core/apply.py", line 1137, in apply_standard
    mapped = lib.map_infer(
  File "pandas/_libs/lib.pyx", line 2870, in pandas._libs.lib.map_infer
  File "/usr/local/lib/python3.10/html/__init__.py", line 130, in unescape
    if '&' not in s:
TypeError: argument of type 'NoneType' is not iterable
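
The traceback shows `html.unescape` receiving `None`, which suggests that some rows have no summary. Below is a minimal sketch of a None-safe wrapper. The helper name `unescape_or_empty` and the sample values are hypothetical, and mapping missing summaries to an empty string is an assumption; dropping those rows may suit the training data better.

```python
import html


def unescape_or_empty(value):
    """Unescape HTML entities in strings; map None (or NaN) to an empty string."""
    if isinstance(value, str):
        return html.unescape(value)
    # Anything that is not a string (None, float NaN) is treated as missing.
    return ""


# Hypothetical sample standing in for the "summary" column.
summaries = ["Fish &amp; chips", None, "a &lt; b"]
cleaned = [unescape_or_empty(s) for s in summaries]
print(cleaned)
```

With pandas, the same helper could be passed to the existing call: `dataset["summary"] = dataset["summary"].apply(unescape_or_empty)`.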

brunoamaral commented 2 years ago

Pushing this up in the roadmap, because it would be nice to have the ML Model update itself.

brunoamaral commented 1 year ago

This issue is over a year old but is still relevant.

I have been looking into it now and then, but never made any progress by increasing the Docker resources. Maybe it's a host limitation?

Steps to train the ML models:

  1. docker exec -it admin ./manage.py 1_data_processor
  2. docker exec -it admin ./manage.py 2_train_models

After this, the command returns "killed". For reference, we are running on a DigitalOcean droplet with 2 vCPUs and 4 GB of memory.

Any ideas?
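
A process that ends with "killed" and no Python traceback is often stopped by the kernel's out-of-memory killer, though that is not confirmed here. One way to test that guess is to log peak memory between stages of the training script. This is a rough sketch that assumes a Unix host (the `resource` module is not available on Windows); `peak_rss_mib`, `run_with_memory_report`, and the stand-in workload are hypothetical names, not part of the project.

```python
import resource
import sys


def peak_rss_mib():
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kibibytes, macOS reports it in bytes.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def run_with_memory_report(step_name, func):
    """Run func() and print the peak memory seen so far once it returns."""
    result = func()
    print(f"{step_name}: peak RSS so far {peak_rss_mib():.1f} MiB", flush=True)
    return result


if __name__ == "__main__":
    # Stand-in workload; a real script would wrap each training stage instead.
    run_with_memory_report("allocate list", lambda: [0] * 1_000_000)
```

If the process is killed partway through a stage, that stage's report never prints, so the last line printed indicates which stage was running and how close the process already was to the droplet's 4 GB.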