ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Predictions: CSV not being generated and OSError: [Errno 22] Invalid argument: results\class_probabilities_"smaller than sign" UNK "bigger than sign".npy #1521

Closed maragato2000 closed 3 weeks ago

maragato2000 commented 2 years ago

I think I might not be the first one to report this, but as I see no activity on the other posts, this might make it visible. Also, part of the error message does not show in their posts.

This concerns a simple text classification model like the one described in the intro. The model seems to be created correctly, but when asked to predict, after running for a while it throws an error: OSError: [Errno 22] Invalid argument: 'results\classprobabilities.npy'

the exact command I am executing >>>> (ludwigenv) C:\Users\GT>ludwig predict --dataset C:/GT/TestKyri/Jan2018AplicarMod.csv --model_path C:/Users/GT/results/experiment_run/model

the config.yaml file describing the network model >>>>{input_features: [{name: doc_text, type: text}], output_features: [{name: class, type: category}]}

logs.zip most importantly, the log file from a failing run >>>>

Evaluation: 100%|
C:\Users\GT\Anaconda3\envs\ludwigenv\lib\site-packages\sklearn\metrics\_classification.py:1308: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use zero_division parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
C:\Users\GT\Anaconda3\envs\ludwigenv\lib\site-packages\sklearn\metrics\_classification.py:1308: UndefinedMetricWarning: Recall is ill-defined and being set to 0.0 in labels with no true samples. Use zero_division parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
Traceback (most recent call last):
  File "C:\Users\GT\Anaconda3\envs\ludwigenv\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\GT\Anaconda3\envs\ludwigenv\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\GT\Anaconda3\envs\ludwigenv\Scripts\ludwig.exe\__main__.py", line 7, in
  File "C:\Users\GT\Anaconda3\envs\ludwigenv\lib\site-packages\ludwig\cli.py", line 136, in main
    CLI()
  File "C:\Users\GT\Anaconda3\envs\ludwigenv\lib\site-packages\ludwig\cli.py", line 71, in __init__
    getattr(self, args.command)()
  File "C:\Users\GT\Anaconda3\envs\ludwigenv\lib\site-packages\ludwig\cli.py", line 83, in evaluate
    evaluate.cli(sys.argv[2:])
  File "C:\Users\GT\Anaconda3\envs\ludwigenv\lib\site-packages\ludwig\evaluate.py", line 295, in cli
    evaluate_cli(**vars(args))
  File "C:\Users\GT\Anaconda3\envs\ludwigenv\lib\site-packages\ludwig\evaluate.py", line 122, in evaluate_cli
    model.evaluate(
  File "C:\Users\GT\Anaconda3\envs\ludwigenv\lib\site-packages\ludwig\api.py", line 871, in evaluate
    postproc_predictions = postprocess(
  File "C:\Users\GT\Anaconda3\envs\ludwigenv\lib\site-packages\ludwig\data\postprocessing.py", line 56, in postprocess
    _save_as_numpy(predictions, output_directory, saved_keys)
  File "C:\Users\GT\Anaconda3\envs\ludwigenv\lib\site-packages\ludwig\data\postprocessing.py", line 69, in _save_as_numpy
    np.save(npy_filename.format(k), v)
  File "", line 5, in save
  File "C:\Users\GT\Anaconda3\envs\ludwigenv\lib\site-packages\numpy\lib\npyio.py", line 525, in save
    file_ctx = open(file, "wb")
OSError: [Errno 22] Invalid argument: 'results\class_probabilities_"smaller than sign" UNK "bigger than sign".npy'

!! I have added "smaller than sign" UNK "bigger than sign".npy myself, because the original message does not show those characters here in the posts (nor in the posts of the other people having this same issue).


tgaddair commented 2 years ago

Thanks for reporting @maragato2000, to clarify, is the true filename something like this?

'results\class_probabilities<UNK>.npy'

I'm wondering if this has to do with having special characters in the path name.

maragato2000 commented 2 years ago

Yes indeed, results\class_probabilities.npy with no in the filename.

By the way, strangely enough, besides the results directory, I also find the same files under system32 and in the parent directory of results...

On Fri, Dec 3, 2021 at 18:12, Travis Addair @.***> wrote:

Thanks for reporting @maragato2000, to clarify, is the true filename something like this?

'results\class_probabilities<UNK>.npy'

I'm wondering if this has to do with having special characters in the path name.


tgaddair commented 2 years ago

Sounds like this may be an issue with the way special characters are handled on Windows. Do you have a small repro script we can run? We can try to set up Windows tests and see if they produce the same error.
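For anyone wanting to check this without Ludwig installed, here is a minimal sketch of a repro that uses only the standard library. It mirrors the open(file, "wb") call at the bottom of the traceback. The helper name can_create is made up for illustration and is not part of Ludwig or NumPy.

```python
import os
import tempfile


def can_create(directory, filename):
    """Return True if a file with this name can be opened for writing.

    Hypothetical helper: it performs the same open(..., "wb") call that
    fails at the end of the traceback in this issue.
    """
    path = os.path.join(directory, filename)
    try:
        with open(path, "wb") as f:
            f.write(b"")
    except OSError as exc:
        print(f"failed: {exc}")
        return False
    os.remove(path)  # clean up the empty test file
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("class_probabilities_[UNK].npy",
                     "class_probabilities_<UNK>.npy"):
            print(name, "->", can_create(tmp, name))
```

On Windows the second name should be the one that fails, since angle brackets are not allowed in file names there; on Linux or macOS both names are expected to work, which would explain why the problem only shows up on Windows.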

carlogrisetti commented 2 years ago

Can reproduce this on Ludwig 0.4.1 on Windows. The issue is exactly that these special characters are not allowed in Windows file names. Other files use square brackets [] (legal) instead of angle brackets <> (illegal).
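One way a fix could look, as a rough sketch only: map the angle brackets to square brackets and replace the remaining characters Windows rejects before building the file name. The function sanitize_filename is hypothetical and is not Ludwig's actual code. It is meant for a bare file name, not a full path, because it also replaces path separators and colons.

```python
import re

# Characters Windows does not allow in file names: < > : " / \ | ? *
# plus the ASCII control characters 0-31.
_WINDOWS_ILLEGAL = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

# Angle brackets become square brackets, matching the other output files.
_BRACKETS = str.maketrans({"<": "[", ">": "]"})


def sanitize_filename(name, replacement="_"):
    """Return a version of a bare file name that is legal on Windows.

    Hypothetical helper for illustration; apply it to the file name only,
    never to a full path.
    """
    return _WINDOWS_ILLEGAL.sub(replacement, name.translate(_BRACKETS))


if __name__ == "__main__":
    print(sanitize_filename("class_probabilities_<UNK>.npy"))
```

With this mapping the failing name from the traceback becomes class_probabilities_[UNK].npy, which Windows accepts.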

carlogrisetti commented 2 years ago

Using [UNK] instead of <UNK> as the unknown padding is a workaround for this issue on Windows. It can also be done even when the model has already been trained, by replacing <UNK> with [UNK] in training_set_metadata.json
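A small sketch of how that replacement could be scripted instead of edited by hand. It makes no assumptions about the layout of training_set_metadata.json: it walks the whole JSON document and swaps the token in every string, keys included. The function names replace_token and patch_metadata are made up for illustration, and it is worth backing up the file first.

```python
import json


def replace_token(obj, old="<UNK>", new="[UNK]"):
    """Recursively replace `old` with `new` in all strings of a JSON value."""
    if isinstance(obj, dict):
        return {replace_token(k, old, new): replace_token(v, old, new)
                for k, v in obj.items()}
    if isinstance(obj, list):
        return [replace_token(v, old, new) for v in obj]
    if isinstance(obj, str):
        return obj.replace(old, new)
    return obj  # numbers, booleans and null are left untouched


def patch_metadata(path, old="<UNK>", new="[UNK]"):
    """Rewrite the JSON file at `path` in place with the token replaced."""
    with open(path, encoding="utf-8") as f:
        metadata = json.load(f)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(replace_token(metadata, old, new), f)
```

Usage would be something like patch_metadata("training_set_metadata.json"), pointing at the copy of the file that belongs to the trained model.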