Kaszanas / SC2_Datasets

https://sc2-datasets.readthedocs.io/
GNU General Public License v3.0

JSONDecodeError Expecting value... #16

Closed by Kaszanas 1 year ago

Kaszanas commented 2 years ago

When attempting to train with these parameters:

    datamodule = SC2ReplaypackDataModule(
        transform=economy_average_vs_outcome,
        replaypack_name="2020_IEM_Katowice",
        replaypack_unpack_dir="D:/Projects/SC2EGSet_Experiments/test/test_files/unpack",
        download=False,
        batch_size=256,
        num_workers=4,
    )
    logistic_regression = LogisticRegression(input_dim=2 * 39, num_classes=2)
    trainer = pl.Trainer(
        logger=True,
        accelerator="gpu",
        devices=1,
        auto_select_gpus=True,
        max_epochs=10,
        log_every_n_steps=2,
    )

The following error was raised:

Traceback (most recent call last):
  File "D:\Projects\SC2EGSet_Experiments\src\experiments\logistic_regression.py", line 51, in <module>
    trainer.fit(model=logistic_regression, datamodule=datamodule)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 740, in fit
    self._call_and_handle_interrupt(
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1199, in _run
    self._dispatch()
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1279, in _dispatch
    self.training_type_plugin.start_training(self)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 202, in start_training
    self._results = trainer.run_stage()
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1289, in run_stage
    return self._run_train()
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1311, in _run_train
    self._run_sanity_check(self.lightning_module)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1375, in _run_sanity_check
    self._evaluation_loop.run()
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\loops\base.py", line 145, in run     
    self.advance(*args, **kwargs)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\loops\dataloader\evaluation_loop.py", line 110, in advance
    dl_outputs = self.epoch_loop.run(dataloader, dataloader_idx, dl_max_batches, self.num_dataloaders)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\loops\base.py", line 140, in run     
    self.on_run_start(*args, **kwargs)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\loops\epoch\evaluation_epoch_loop.py", line 86, in on_run_start
    self._dataloader_iter = _update_dataloader_iter(data_fetcher, self.batch_progress.current.ready)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\loops\utilities.py", line 121, in _update_dataloader_iter
    dataloader_iter = enumerate(data_fetcher, batch_idx)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\utilities\fetching.py", line 199, in __iter__
    self.prefetching(self.prefetch_batches)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\utilities\fetching.py", line 258, in prefetching
    self._fetch_next_batch()
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\pytorch_lightning\utilities\fetching.py", line 300, in _fetch_next_batch
    batch = next(self.dataloader_iter)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\torch\utils\data\dataloader.py", line 530, in __next__ 
    data = self._next_data()
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\torch\utils\data\dataloader.py", line 1224, in _next_data
    return self._process_data(data)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\torch\utils\data\dataloader.py", line 1250, in _process_data
    data.reraise()
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\torch\_utils.py", line 456, in reraise
    raise RuntimeError(msg) from None
RuntimeError: Caught JSONDecodeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\torch\utils\data\_utils\worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in fetch   
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\Projects\SC2EGSet_Experiments\venv_3_10\lib\site-packages\torch\utils\data\dataset.py", line 471, in __getitem__ 
    return self.dataset[self.indices[idx]]
  File "D:\Projects\SC2EGSet_Experiments\src\dataset\pytorch_datasets\sc2_replaypack_dataset.py", line 97, in __getitem__   
    replay_data = SC2ReplayData.from_file(replay_filepath=self.list_of_files[index])
  File "D:\Projects\SC2EGSet_Experiments\src\dataset\replay_data\sc2_replay_data.py", line 46, in from_file
    loaded_data = json.load(replay_file)
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 128419 column 17 (char 2461123)

Kaszanas commented 2 years ago

This happened for the following file: '7cba2d086ef64206bb2d15721bb033e6.SC2Replay.json'

These are the file's contents at the reported location:

    {
      "distance": null,
      "evtTypeName": "CameraUpdate",
      "follow": f!lse,
      "id": 49,
      "loop": 5461,
      "pitch": null,
      "reason": null,
      "target": {
        "x": 3.921875,
        "y": 0.9873046875
      },
      "userid": {
        "userId": 3
      },
      "yaw": null
    },

This error is not caused by the code in this repository; the issue should be moved to the project where the replays are parsed. fyi: @leafnode
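For anyone hitting the same error, a minimal sketch of how the offending text can be located from the exception itself. It uses only the standard `json` module; `describe_json_error` is a hypothetical helper name and is not part of this repository.

```python
import json
from typing import Optional


def describe_json_error(text: str, context: int = 20) -> Optional[str]:
    """Return a short description of the first JSON syntax error, or None if text parses.

    Hypothetical helper for illustration; not part of SC2_Datasets.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # err.pos is the character offset reported as "(char N)" in the message.
        start = max(err.pos - context, 0)
        snippet = text[start : err.pos + context]
        return f"{err.msg} at line {err.lineno}, column {err.colno}: {snippet!r}"
    return None


# Shortened version of the corrupted record quoted above.
corrupted = '{"evtTypeName": "CameraUpdate", "follow": f!lse, "id": 49}'
print(describe_json_error(corrupted))
```

The `pos`, `lineno`, `colno`, and `msg` attributes are documented on `json.JSONDecodeError`, so the same approach should work on the full replay file once it is read into a string.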

Kaszanas commented 2 years ago

Some more faulty data:

ERROR:root:JSONDecodeError was raised for file G:\Projects\SC2EGSet_Experiments\test\test_files\unpack\2021_Dreamhack_SC2_Masters_Fall\2021_Dreamhack_SC2_Masters_Fall_data\6e8cb19cebd044a3af66d54f0e5d8c45.SC2Replay.json  
Traceback (most recent call last):
  File "G:\Projects\SC2EGSet_Experiments\src\dataset\replay_data\sc2_replay_data.py", line 46, in from_file   
    loaded_data = json.load(replay_file)
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\kasza\.pyenv\pyenv-win\versions\3.10.2\lib\json\decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 48886 column 7 (char 937412)

    {
      "distance": null,
      "evtTypeName": "CameraUpdate",
      "follow": false,
      "id": 49,
      "loop": 1135,
      "pitch": null,
      "reason": null,
      "target": {
        "x": 2.0841064453125,
        "y": 3.4232177734375
      },
      "userid": {
        "userId": 7
      =,
      "yaw": null
    },

fyi: @leafnode

Kaszanas commented 2 years ago

These are bitflips.
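Both corrupted snippets in this thread are consistent with that diagnosis. Assuming the intended text was `false` and `},`, each bad character (`!` in place of `a`, `=` in place of `}`) differs from the expected one by exactly one bit. A small sketch that checks this; `differing_bits` is a hypothetical helper name:

```python
def differing_bits(expected: str, observed: str) -> list[int]:
    """Return the bit positions (0 = least significant) where two ASCII characters differ."""
    diff = ord(expected) ^ ord(observed)
    return [bit for bit in range(8) if diff & (1 << bit)]


# (expected, observed) pairs taken from the two corrupted records in this thread;
# the "expected" side is an assumption about what the parser originally wrote.
for expected, observed in [("a", "!"), ("}", "=")]:
    print(expected, observed, differing_bits(expected, observed))
```

In both cases the only differing bit is bit 6 (`0x40`): `a` is `0x61` and `!` is `0x21`, while `}` is `0x7D` and `=` is `0x3D`.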

Kaszanas commented 2 years ago

Validators will address this issue: files that fail to parse will be skipped.
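A minimal sketch of what such a validation pass could look like, assuming the replay files sit as `*.json` files directly inside one directory. `split_valid_replays` is a hypothetical name used for illustration only; the validators in the library may be structured differently.

```python
import json
import logging
from pathlib import Path


def split_valid_replays(directory: Path) -> tuple[list[Path], list[Path]]:
    """Try to parse every *.json file in directory and return (valid, skipped) paths.

    Hypothetical helper for illustration; not the library's actual validator.
    """
    valid: list[Path] = []
    skipped: list[Path] = []
    for path in sorted(directory.glob("*.json")):
        try:
            with path.open(encoding="utf-8") as replay_file:
                json.load(replay_file)
        except (json.JSONDecodeError, UnicodeDecodeError):
            # A flipped bit can break the JSON syntax or the UTF-8 encoding itself.
            logging.exception("Skipping unparsable replay file %s", path)
            skipped.append(path)
        else:
            valid.append(path)
    return valid, skipped
```

The list of valid paths could then be passed to the dataset in place of the full directory listing, so that a DataLoader worker never reaches a corrupted file. Parsing every file up front is slow for large replaypacks, so caching the result of the pass would likely be worthwhile.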