facebookresearch / ParlAI

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
https://parl.ai
MIT License

Error: File being used by another process when downloading datasets #3311

Closed Auxodevio closed 3 years ago

Auxodevio commented 3 years ago

Bug description: When downloading datasets (babi.tar.gz and personachat.tgz), I encounter the following error on Windows 10: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'E:\thesis\ParlAI\data\Persona-Chat\personachat.tgz'

Reproduction steps: Running parlai display_data -t babi:task1k:1 produces: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'E:\thesis\ParlAI\data\bAbI\babi.tar.gz'

Running parlai train_model -t personachat -m transformer/ranker -mf /tmp/model_tr6 produces: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'E:\thesis\ParlAI\data\Persona-Chat\personachat.tgz'

Expected behavior: Both commands should complete without error. The console output indicates that the file was downloaded successfully in both cases, and I am not accessing the file from any other process.

Logs:

PS E:\thesis> parlai display_data -t babi:task1k:1
11:52:33 | Opt:
11:52:33 |     allow_missing_init_opts: False
11:52:33 |     batchsize: 1
11:52:33 |     datapath: E:\thesis\ParlAI\data
11:52:33 |     datatype: train:ordered
11:52:33 |     dict_class: None
11:52:33 |     display_add_fields:
11:52:33 |     download_path: None
11:52:33 |     dynamic_batching: None
11:52:33 |     hide_labels: False
11:52:33 |     ignore_agent_reply: True
11:52:33 |     image_cropsize: 224
11:52:33 |     image_mode: raw
11:52:33 |     image_size: 256
11:52:33 |     init_model: None
11:52:33 |     init_opt: None
11:52:33 |     loglevel: info
11:52:33 |     max_display_len: 1000
11:52:33 |     model: None
11:52:33 |     model_file: None
11:52:33 |     multitask_weights: [1]
11:52:33 |     num_examples: 10
11:52:33 |     override: "{'task': 'babi:task1k:1'}"
11:52:33 |     parlai_home: E:\thesis\ParlAI
11:52:33 |     starttime: Dec16_11-52
11:52:33 |     task: babi:task1k:1
11:52:33 |     verbose: False
11:52:33 | Current ParlAI commit: f2c15b0689a299468e9a31ac21f7fbf4c56e2b2a
11:52:33 | creating task(s): babi:task1k:1
[building data: E:\thesis\ParlAI\data\bAbI]
11:52:33 | Downloading http://parl.ai/downloads/babi/babi.tar.gz to E:\thesis\ParlAI\data\bAbI\babi.tar.gz
Downloading babi.tar.gz: 0.00B [00:00, ?B/s]
Traceback (most recent call last):
  File "e:\thesis\parlai\parlai\core\worlds.py", line 1212, in _create_task_agents
    task_agents = my_module.create_agents(opt)  # type: ignore
AttributeError: module 'parlai.tasks.babi.agents' has no attribute 'create_agents'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\Scripts\parlai-script.py", line 33, in <module>
    sys.exit(load_entry_point('parlai', 'console_scripts', 'parlai')())
  File "e:\thesis\parlai\parlai\__main__.py", line 14, in main
    superscript_main()
  File "e:\thesis\parlai\parlai\core\script.py", line 307, in superscript_main
    return SCRIPT_REGISTRY[cmd].klass._run_from_parser_and_opt(opt, parser)
  File "e:\thesis\parlai\parlai\core\script.py", line 90, in _run_from_parser_and_opt
    return script.run()
  File "e:\thesis\parlai\parlai\scripts\display_data.py", line 108, in run
    return display_data(self.opt)
  File "e:\thesis\parlai\parlai\scripts\display_data.py", line 70, in display_data
    world = create_task(opt, agent)
  File "e:\thesis\parlai\parlai\core\worlds.py", line 1263, in create_task
    world = create_task_world(opt, user_agents, default_world=default_world)
  File "e:\thesis\parlai\parlai\core\worlds.py", line 1227, in create_task_world
    task_agents = _create_task_agents(opt)
  File "e:\thesis\parlai\parlai\core\worlds.py", line 1215, in _create_task_agents
    return create_task_agent_from_taskname(opt)
  File "e:\thesis\parlai\parlai\core\teachers.py", line 2454, in create_task_agent_from_taskname
    task_agents = teacher_class(opt)
  File "e:\thesis\parlai\parlai\tasks\babi\agents.py", line 51, in __init__
    opt['datafile'] = _path('', self.task_num, opt)
  File "e:\thesis\parlai\parlai\tasks\babi\agents.py", line 15, in _path
    build(opt)
  File "e:\thesis\parlai\parlai\tasks\babi\build.py", line 35, in build
    downloadable_file.download_file(dpath)
  File "e:\thesis\parlai\parlai\core\build_data.py", line 92, in download_file
    untar(dpath, self.file_name)
  File "e:\thesis\parlai\parlai\core\build_data.py", line 255, in untar
    return _untar(path, fname, delete=delete, flatten=flatten_tar)
  File "e:\thesis\parlai\parlai\core\build_data.py", line 300, in _untar
    PathManager.rm(fullpath)
  File "C:\ProgramData\Anaconda3\lib\site-packages\iopath-0.1.2-py3.8.egg\iopath\common\file_io.py", line 815, in rm
  File "C:\ProgramData\Anaconda3\lib\site-packages\iopath-0.1.2-py3.8.egg\iopath\common\file_io.py", line 472, in _rm
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'E:\\thesis\\ParlAI\\data\\bAbI\\babi.tar.gz'

stephenroller commented 3 years ago

See #3185 for a workaround.

stephenroller commented 3 years ago

I've put up a more permanent fix in #3340.
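
For readers hitting the same error: the traceback in the report ends in PathManager.rm(fullpath) inside _untar, right after extraction. On Windows, deleting a file fails with WinError 32 while any handle to it is still open, including a handle held by the same process. The sketch below is only an illustration of that general pattern using the standard library. It is not the code from #3185 or #3340, and the helper names make_archive and untar_and_delete are hypothetical. It shows the archive handle being closed by a with block before the file is removed.

```python
import os
import tarfile
import tempfile


def make_archive(directory):
    """Create a tiny .tar.gz with a single member and return its path."""
    member = os.path.join(directory, "hello.txt")
    with open(member, "w") as f:
        f.write("hello")
    archive = os.path.join(directory, "sample.tar.gz")
    with tarfile.open(archive, "w:gz") as tf:
        tf.add(member, arcname="hello.txt")
    os.remove(member)
    return archive


def untar_and_delete(archive, dest):
    """Extract `archive` into `dest`, then delete the archive file."""
    # Leaving the with block closes the archive handle. Only then is the
    # file removed, so no open handle remains to block deletion on Windows.
    with tarfile.open(archive, "r:*") as tf:
        tf.extractall(dest)
    os.remove(archive)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = make_archive(tmp)
        out = os.path.join(tmp, "out")
        os.makedirs(out)
        untar_and_delete(archive, out)
        print("archive still present:", os.path.exists(archive))
        print("extracted files:", os.listdir(out))
```

If the removal were moved inside the with block, the same code would be expected to raise PermissionError on Windows, while on Linux and macOS it would typically succeed, which may explain why the problem shows up only for Windows users.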