HazyResearch / evaporate

This repo contains data and code for the paper "Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes"

Issue during the train #6

Closed: fenglixa closed this issue 1 year ago

fenglixa commented 1 year ago
Average abstains across documents: 2.61
Average unique votes per document: 1.51
Abstains: True
Using device: cpu
Computing O...
Estimating \mu...
[1000 epo]: TRAIN:[loss=5.022]
[2000 epo]: TRAIN:[loss=4.516]
[3000 epo]: TRAIN:[loss=4.017]
[4000 epo]: TRAIN:[loss=3.548]
[5000 epo]: TRAIN:[loss=3.127]
[6000 epo]: TRAIN:[loss=2.761]
[7000 epo]: TRAIN:[loss=2.452]
[8000 epo]: TRAIN:[loss=2.194]
[9000 epo]: TRAIN:[loss=1.977]
[10000 epo]: TRAIN:[loss=1.792]
Finished Training
Trained Label Model Metrics (No deps):
Traceback (most recent call last):
  File "run_profiler.py", line 476, in <module>
    main()
  File "run_profiler.py", line 472, in main
    run_experiment(profiler_args)
  File "run_profiler.py", line 315, in run_experiment
    num_toks, success = run_profiler(
  File "/root/test/evaporate/src/profiler.py", line 676, in run_profiler
    file2metadata, num_toks = combine_extractions(
  File "/root/test/evaporate/src/profiler.py", line 158, in combine_extractions
    preds, used_deps, missing_files = run_ws(
  File "/root/test/evaporate/src/./weak_supervision/run_ws.py", line 219, in run_ws
    scores, preds = label_model.score(
  File "/root/test/evaporate/metal-evap/metal/classifier.py", line 134, in score
    Y_p, Y, Y_s = self._get_predictions(
  File "/root/test/evaporate/metal-evap/metal/classifier.py", line 597, in _get_predictions
    Y_pb, Y_sb = self.predict(
  File "/root/test/evaporate/metal-evap/metal/classifier.py", line 100, in predict
    Y_p = self._break_ties(Y_s, break_ties).astype(np.int)
  File "/root/anaconda3/envs/evaporate/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'int'.
`np.int` was a deprecated alias for the builtin `int`. To avoid this error in existing code, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
fenglixa commented 1 year ago

Fixed by modifying line 100 of metal-evap/metal/classifier.py, changing np.int to np.int_:

# diff metal-evap/metal/classifier.py metal-evap/metal/classifier.py.org
100c100
<         Y_p = self._break_ties(Y_s, break_ties).astype(np.int_)
---
>         Y_p = self._break_ties(Y_s, break_ties).astype(np.int)
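
For anyone hitting the same error, here is a minimal sketch of why the one-line change works. The `np.int` alias was deprecated in NumPy 1.20 and removed in 1.24, so code that still uses it raises the AttributeError shown above, while both the builtin `int` and `np.int_` remain valid arguments to `astype`. The array below is a made-up stand-in for the output of `self._break_ties(...)`, not data from the actual run.

```python
import numpy as np

# Hypothetical stand-in for the array returned by self._break_ties(Y_s, break_ties);
# the values are invented for illustration.
Y_h = np.array([1.0, 2.0, 2.0, 1.0])

# Both spellings are accepted by NumPy releases before and after the
# removal of the np.int alias, and they resolve to the same dtype.
Y_p_builtin = Y_h.astype(int)
Y_p_numpy = Y_h.astype(np.int_)

print(Y_p_builtin.dtype == Y_p_numpy.dtype)
print(Y_p_builtin.tolist())
```

Using the builtin `int` is what the NumPy error message itself suggests; `np.int_` behaves the same way here. Pinning NumPy below 1.24 would also avoid the error without touching the code, at the cost of staying on an older release.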
simran-arora commented 1 year ago

Hi, is your issue fixed?