Closed LourencoVazPato closed 2 years ago
The parsing depends on the type hints. What is the type hint for dataset_path? If it is str, after parsing the value should be a string with whatever you have in the command line. If the type is a path like in the docs you linked, the parsed value will not be a str but a Path object.
If you don't have type hints, then add them. This is how the parser knows how to validate. LightningCLI is configured so that when there is no type hint, it defaults to Any. If the type is Any, the parser cannot know whether the value should be the path to a json file or its contents.
In my LightningDataModule, the type hint for dataset is
dataframe_or_data_path: Optional[
Union[pd.DataFrame, List[PathLike], PathLike]
] = None
where PathLike = Union[str, os.PathLike].
Could this be an issue?
With the following:
import os
from jsonargparse import ArgumentParser
from typing import Any, List, Union
parser = ArgumentParser()
PathLike = Union[str, os.PathLike]
parser.add_argument('--path', type=Union[List[PathLike], PathLike])
cfg = parser.parse_args(['--path=issue_159.json'])
print(cfg)
The result is correct: Namespace(path='issue_159.json').
os.PathLike is not currently supported. But since the union includes str, there is no issue in that case. Support for os.PathLike can be added. Still, this does not explain what you originally described. You will need to post a minimal reproducible script.
@mauvilsa I updated the description with the code to reproduce. I have initialized the arguments with .add_subclass_arguments() as in pytorch-lightning v1.6.5.
@LourencoVazPato thank you for adding the reproduction code. Unfortunately I have been unable to reproduce it. First I tried in a normal virtual environment and then I tried with poetry to be as close as possible to what you reported. In Ubuntu 20.04 I get:
$ poetry run python issue_159.py
The currently activated Python version 3.8.10 is not supported by the project (^3.9).
Trying to find and use a compatible version.
Using python3.9 (3.9.13)
Namespace(data=Namespace(class_path='__main__.SubDataModule', init_args=Namespace(data_path='data.json')))
Namespace(data=<__main__.SubDataModule object at 0x7f7c5f185c70>)
The pyproject.toml is the following:
[tool.poetry]
name = "issue-159"
version = "0.1.0"
description = ""
authors = []
readme = "README.md"
packages = [{include = "issue_159"}]
[tool.poetry.dependencies]
python = "^3.9"
jsonargparse = "4.13.0"
pandas = "^1.4.4"
pytorch-lightning = "1.6.5"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
@mauvilsa I've been able to reproduce with this poetry env:
[tool.poetry]
name = "issue-159"
version = "0.1.0"
description = ""
authors = []
[tool.poetry.dependencies]
python = "^3.9"
jsonargparse = {extras = ["signatures"], version = "4.13.0"}
pandas = "^1.4.4"
pytorch-lightning = "1.6.5"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
If you want to reproduce my python env here's the poetry lock.
I made a silly mistake when trying to reproduce. I was then able to reproduce it in a normal virtual environment and with any Python version. I have pushed the fix in commit https://github.com/omni-us/jsonargparse/commit/744c0c1105a02a3dc3845d6488189cb22ca684ee.
🐛 Bug report
Not sure if this is a bug or a feature, but when I call a script (e.g. PyTorch-Lightning CLI) with an argument like --dataset_path *.json, the parser reads the json file and interprets it as a configuration file (not as a dataset file in this case), and errors out because it is not a valid config file.
I can see there's documentation on parsing file paths, but cannot find any reference on reading them as string arguments.
Is it possible to disable this parsing? What other alternatives are there?
Thanks in advance
To reproduce
Create a data.json file containing some JSON dataset. E.g.:
Errors out with:
Curiously, it passes if I use a data.csv instead of JSON.
Expected behavior
The parser reads the arg value as a str.
Environment
How jsonargparse was installed (e.g. pip install jsonargparse[all]): poetry add jsonargparse