Open noklam opened 1 month ago
The prerequisite for building this is to store the YAML line number (cursor position) of each catalog entry at load time.
For reference, ruamel.yaml provides the cursor information:
In [11]: from ruamel.yaml import YAML
In [12]: yaml = YAML()
In [13]: data = yaml.load("""
...: # testing line and column based on SO
...: # http://stackoverflow.com/questions/13319067/
...: - key1: item 1
...:   key2: item 2
...: - key3: another item 1
...:   key4: another item 2
...: """)
In [14]: data
Out[14]: [{'key1': 'item 1', 'key2': 'item 2'}, {'key3': 'another item 1', 'key4': 'another item 2'}]
In [15]: type(data)
Out[15]: ruamel.yaml.comments.CommentedSeq
In [16]: data[0].lc
Out[16]: LineCol(3, 7)
In [17]: type(data[0])
Out[17]: ruamel.yaml.comments.CommentedMap
It was mentioned in https://github.com/kedro-org/kedro/issues/2821#issuecomment-1845199192 that some YAML loaders preserve metadata about line and col. After some investigation, it is less straightforward than expected. The example above is a list, so you can do data[0].lc, where data is a CommentedSeq and data[0] is a CommentedMap. You can do similar things if the value is a dictionary. It doesn't work if the value is a simple str or int. For example:
# abc.yml
a: 1
b: 2
>>> from ruamel.yaml import YAML
>>> yaml = YAML()
>>> with open("abc.yml") as f:
...     data = yaml.load(f)
...
>>> data["a"]
1
In this case, data["a"] returns an int directly, so the value itself carries no line or column metadata.
Rough idea:
__line__ attribute
__line__ metadata, then update the following constructors:
OmegaConf.create
OmegaConf._create_impl
DictConfig.__init__