recski / brise-plandok

Information extraction from text documents of the zoning plan of the City of Vienna
MIT License

potato/create_dataset fails unexpectedly if target data files exist #52

Closed (recski closed this 2 years ago)

recski commented 2 years ago

This command works fine if data/gold* files don't exist:

python create_dataset.py -d ~/sandbox/brise-nlp/annotation/2021_09/full_data -g fourlang -o -n gold

But if I rerun it to regenerate the graphs, the same command fails with this error:

/home/recski/miniconda3/envs/brise/lib/python3.7/site-packages/pandas/core/indexing.py:845: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.obj[key] = _infer_fill_value(value)
/home/recski/miniconda3/envs/brise/lib/python3.7/site-packages/pandas/core/indexing.py:966: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.obj[item] = s

Emptying the data folder solves the problem.

@Eszti please have a look when you can

Eszti commented 2 years ago

These are only warnings. Running the command multiple times for the same dataset has no effect unless the dataset has been removed.
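For context, here is a minimal sketch of how a pandas `SettingWithCopyWarning` like the one in the log usually arises and how it can be silenced at the source. The thread does not show the relevant code in `create_dataset.py`, so the DataFrame, the column names, and the `"placeholder"` value below are hypothetical stand-ins, not the project's actual data or fix. The usual pattern is that a column is assigned on a filtered slice of a DataFrame; taking an explicit `.copy()` of the slice first makes the intent unambiguous.

```python
import warnings

import pandas as pd

# Hypothetical toy data; the real frames used by create_dataset.py
# are not shown in this thread.
df = pd.DataFrame({"text": ["a", "b", "c"], "label": [1, 0, 1]})

# Filtering returns a new object that pandas may treat as a slice of `df`.
# An explicit .copy() states that later writes target an independent frame.
subset = df[df["label"] == 1].copy()

# Record any warnings raised while adding the new column.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    subset["graph"] = "placeholder"

# Match by class name so the check does not depend on where a given
# pandas version exposes the warning class.
copy_warnings = [
    w for w in caught if w.category.__name__ == "SettingWithCopyWarning"
]
print(len(copy_warnings))
```

With the explicit copy, the assignment affects only `subset` and leaves `df` unchanged, which is typically what code of this shape intends.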