databrickslabs / dbx

🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.
https://dbx.readthedocs.io

dbx doesn't like Jinja Macros #804

Open aviadshimoni opened 1 year ago

aviadshimoni commented 1 year ago

Expected Behavior

There is a lot of boilerplate code in deployment.yaml that we (DevOps) want to hide from the developers and keep in a common repo (DRY). To do that, I created a common_function.yaml file and put Jinja functions in it. [screenshot]

Current Behavior

[screenshot]

Steps to Reproduce (for bugs)

Create common_function.yaml, pass it to dbx via --jinja-variables-file, and reference its functions from deployment.yaml.

Context

If there is a different way to achieve my goal, I would love to hear about it.

Your Environment

aviadshimoni commented 1 year ago

@renardeinside Hi Ivan, I now see that I used the variables file incorrectly, since it doesn't support Jinja syntax. Custom functions are documented here: https://dbx.readthedocs.io/en/latest/features/jinja_support/#custom-functions-support [screenshot]

The gcp_attributes call works, but the common get_cluster call doesn't. As I see it, a function has to be called after the ':' and must return the whole value for that property.

Any idea when this feature will be stable rather than experimental? We want to build our DRY solution on it, but we are afraid it will break.
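One way to read the point about calling functions after the ':' is that each custom function should return a complete, single-line value. Below is a minimal sketch, assuming the functions live in `.dbx/_custom_jinja_functions.py` (the path that appears in the dbx logs in this thread) and are reachable in the template under a `custom.` prefix, as the linked docs describe. The function names and defaults are hypothetical; the attribute values are the ones mentioned in this thread. JSON is used because a JSON object is also valid YAML flow syntax, so the rendered line parses the same way at any indentation level.

```python
import json


def gcp_attributes(zone_id="auto", availability="PREEMPTIBLE_WITH_FALLBACK_GCP"):
    """Hypothetical helper: shared GCP attributes as a plain dict."""
    return {
        "zone_id": zone_id,
        "use_preemptible_executors": True,
        "availability": availability,
    }


def gcp_attributes_inline(**overrides):
    """Hypothetical helper: the same attributes as one line of JSON.

    A single-line JSON object can sit directly after 'key:' in a YAML file,
    so the template does not have to manage indentation for nested keys.
    """
    return json.dumps(gcp_attributes(**overrides))


if __name__ == "__main__":
    print(gcp_attributes_inline())
```

In deployment.yaml this would be referenced as something like `gcp_attributes: {{ custom.gcp_attributes_inline() }}`; that line is an assumption about how dbx exposes the function and has not been checked against dbx itself.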

aviadshimoni commented 1 year ago

dbx logs from the example above:

[dbx][2023-06-26 09:59:20.385] 🔌 Found custom Jinja functions defined in .dbx/_custom_jinja_functions.py, loading them
[dbx][2023-06-26 09:59:20.386] ✅ Custom Jinja functions successfully loaded
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /usr/local/lib/python3.10/site-packages/dbx/commands/deploy.py:111 in deploy │
│                                                                              │
│   108 │   if not branch_name:                                                │
│   109 │   │   branch_name = get_current_branch_name()                        │
│   110 │                                                                      │
│ ❱ 111 │   config_reader = ConfigReader(deployment_file, jinja_variables_file │
│   112 │   config = config_reader.with_build_properties(                      │
│   113 │   │   BuildProperties(potential_build=True, no_rebuild=no_rebuild)   │
│   114 │   ).get_config()                                                     │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/dbx/api/config_reader.py:122 in      │
│ __init__                                                                     │
│                                                                              │
│   119 │   def __init__(self, path: Path, jinja_vars_file: Optional[Path] = N │
│   120 │   │   self._jinja_vars_file = jinja_vars_file                        │
│   121 │   │   self._path = path                                              │
│ ❱ 122 │   │   self._reader = self._define_reader()                           │
│   123 │   │   self._build_properties = BuildProperties()                     │
│   124 │                                                                      │
│   125 │   def with_build_properties(self, build_properties: BuildProperties) │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/dbx/api/config_reader.py:140 in      │
│ _define_reader                                                               │
│                                                                              │
│   137 │   │   │   │   )                                                      │
│   138 │   │   │   │   return Jinja2ConfigReader(self._path, ext=self._path.s │
│   139 │   │   elif ProjectConfigurationManager().get_jinja_support():        │
│ ❱ 140 │   │   │   return Jinja2ConfigReader(self._path, ext=self._path.suffi │
│   141 │   │   else:                                                          │
│   142 │   │   │   if self._jinja_vars_file:                                  │
│   143 │   │   │   │   raise Exception(                                       │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/dbx/api/config_reader.py:64 in       │
│ __init__                                                                     │
│                                                                              │
│    61 │   def __init__(self, path: Path, ext: str, jinja_vars_file: Optional │
│    62 │   │   self._ext = ext                                                │
│    63 │   │   self._jinja_vars_file = jinja_vars_file                        │
│ ❱  64 │   │   super().__init__(path)                                         │
│    65 │                                                                      │
│    66 │   @staticmethod                                                      │
│    67 │   def _read_vars_file(file_path: Path) -> Dict[str, Any]:            │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/dbx/api/config_reader.py:23 in       │
│ __init__                                                                     │
│                                                                              │
│    20 class _AbstractConfigReader(ABC):                                      │
│    21 │   def __init__(self, path: Path):                                    │
│    22 │   │   self._path = path                                              │
│ ❱  23 │   │   self.config = self.get_config()                                │
│    24 │                                                                      │
│    25 │   def get_config(self) -> DeploymentConfig:                          │
│    26 │   │   return self._read_file()                                       │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/dbx/api/config_reader.py:26 in       │
│ get_config                                                                   │
│                                                                              │
│    23 │   │   self.config = self.get_config()                                │
│    24 │                                                                      │
│    25 │   def get_config(self) -> DeploymentConfig:                          │
│ ❱  26 │   │   return self._read_file()                                       │
│    27 │                                                                      │
│    28 │   @abstractmethod                                                    │
│    29 │   def _read_file(self) -> DeploymentConfig:                          │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/dbx/api/config_reader.py:101 in      │
│ _read_file                                                                   │
│                                                                              │
│    98 │   │   │   _content = json.loads(rendered)                            │
│    99 │   │   │   return JsonConfigReader.read_content(_content)             │
│   100 │   │   elif self._ext in [".yml", ".yaml"]:                           │
│ ❱ 101 │   │   │   _content = yaml.load(rendered, yaml.SafeLoader)            │
│   102 │   │   │   return DeploymentConfig.from_payload(_content)             │
│   103 │   │   else:                                                          │
│   104 │   │   │   raise Exception(f"Unexpected extension for Jinja reader: { │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/yaml/__init__.py:81 in load          │
│                                                                              │
│    78 │   """                                                                │
│    79 │   loader = Loader(stream)                                            │
│    80 │   try:                                                               │
│ ❱  81 │   │   return loader.get_single_data()                                │
│    82 │   finally:                                                           │
│    83 │   │   loader.dispose()                                               │
│    84                                                                        │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/yaml/constructor.py:49 in            │
│ get_single_data                                                              │
│                                                                              │
│    46 │                                                                      │
│    47 │   def get_single_data(self):                                         │
│    48 │   │   # Ensure that the stream contains a single document and constr │
│ ❱  49 │   │   node = self.get_single_node()                                  │
│    50 │   │   if node is not None:                                           │
│    51 │   │   │   return self.construct_document(node)                       │
│    52 │   │   return None                                                    │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/yaml/composer.py:36 in               │
│ get_single_node                                                              │
│                                                                              │
│    33 │   │   # Compose a document if the stream is not empty.               │
│    34 │   │   document = None                                                │
│    35 │   │   if not self.check_event(StreamEndEvent):                       │
│ ❱  36 │   │   │   document = self.compose_document()                         │
│    37 │   │                                                                  │
│    38 │   │   # Ensure that the stream contains no more documents.           │
│    39 │   │   if not self.check_event(StreamEndEvent):                       │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/yaml/composer.py:55 in               │
│ compose_document                                                             │
│                                                                              │
│    52 │   │   self.get_event()                                               │
│    53 │   │                                                                  │
│    54 │   │   # Compose the root node.                                       │
│ ❱  55 │   │   node = self.compose_node(None, None)                           │
│    56 │   │                                                                  │
│    57 │   │   # Drop the DOCUMENT-END event.                                 │
│    58 │   │   self.get_event()                                               │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/yaml/composer.py:84 in compose_node  │
│                                                                              │
│    81 │   │   elif self.check_event(SequenceStartEvent):                     │
│    82 │   │   │   node = self.compose_sequence_node(anchor)                  │
│    83 │   │   elif self.check_event(MappingStartEvent):                      │
│ ❱  84 │   │   │   node = self.compose_mapping_node(anchor)                   │
│    85 │   │   self.ascend_resolver()                                         │
│    86 │   │   return node                                                    │
│    87                                                                        │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/yaml/composer.py:133 in              │
│ compose_mapping_node                                                         │
│                                                                              │
│   130 │   │   │   #if item_key in node.value:                                │
│   131 │   │   │   #    raise ComposerError("while composing a mapping", star │
│   132 │   │   │   #            "found duplicate key", key_event.start_mark)  │
│ ❱ 133 │   │   │   item_value = self.compose_node(node, item_key)             │
│   134 │   │   │   #node.value[item_key] = item_value                         │
│   135 │   │   │   node.value.append((item_key, item_value))                  │
│   136 │   │   end_event = self.get_event()                                   │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/yaml/composer.py:84 in compose_node  │
│                                                                              │
│    81 │   │   elif self.check_event(SequenceStartEvent):                     │
│    82 │   │   │   node = self.compose_sequence_node(anchor)                  │
│    83 │   │   elif self.check_event(MappingStartEvent):                      │
│ ❱  84 │   │   │   node = self.compose_mapping_node(anchor)                   │
│    85 │   │   self.ascend_resolver()                                         │
│    86 │   │   return node                                                    │
│    87                                                                        │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/yaml/composer.py:127 in              │
│ compose_mapping_node                                                         │
│                                                                              │
│   124 │   │   │   │   flow_style=start_event.flow_style)                     │
│   125 │   │   if anchor is not None:                                         │
│   126 │   │   │   self.anchors[anchor] = node                                │
│ ❱ 127 │   │   while not self.check_event(MappingEndEvent):                   │
│   128 │   │   │   #key_event = self.peek_event()                             │
│   129 │   │   │   item_key = self.compose_node(node, None)                   │
│   130 │   │   │   #if item_key in node.value:                                │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/yaml/parser.py:98 in check_event     │
│                                                                              │
│    95 │   │   # Check the type of the next event.                            │
│    96 │   │   if self.current_event is None:                                 │
│    97 │   │   │   if self.state:                                             │
│ ❱  98 │   │   │   │   self.current_event = self.state()                      │
│    99 │   │   if self.current_event is not None:                             │
│   100 │   │   │   if not choices:                                            │
│   101 │   │   │   │   return True                                            │
│                                                                              │
│ /usr/local/lib/python3.10/site-packages/yaml/parser.py:438 in                │
│ parse_block_mapping_key                                                      │
│                                                                              │
│   435 │   │   │   │   return self.process_empty_scalar(token.end_mark)       │
│   436 │   │   if not self.check_token(BlockEndToken):                        │
│   437 │   │   │   token = self.peek_token()                                  │
│ ❱ 438 │   │   │   raise ParserError("while parsing a block mapping", self.ma │
│   439 │   │   │   │   │   "expected <block end>, but found %r" % token.id, t │
│   440 │   │   token = self.get_token()                                       │
│   441 │   │   event = MappingEndEvent(token.start_mark, token.end_mark)      │
╰──────────────────────────────────────────────────────────────────────────────╯
ParserError: while parsing a block mapping
  in "<unicode string>", line 11, column 3:
      google_service_account_gcp_attri ... 
      ^
expected <block end>, but found '<block mapping start>'
  in "<unicode string>", line 26, column 5:
        gcp_attributes: {'zone_id': 'aut ... 
        ^
Cleaning up project directory and file based variables 00:01
ERROR: Job failed: command terminated with exit code 1

aviadshimoni commented 1 year ago

Another issue, this time while trying to use includes (https://dbx.readthedocs.io/en/latest/features/jinja_support/#support-for-includes):

This is my gcp_attributes_yaml.j2 file:

    zone_id: auto
    use_preemptible_executors: true
    availability: PREEMPTIBLE_WITH_FALLBACK_GCP

This is how I reference it in my deployment.yaml:

    job_clusters:

Are you sure that include works for YAML files and not only for JSON?

cristian-rincon commented 11 months ago

Another issue, this time while trying to use includes (https://dbx.readthedocs.io/en/latest/features/jinja_support/#support-for-includes):

This is my gcp_attributes_yaml.j2 file:

    zone_id: auto
    use_preemptible_executors: true
    availability: PREEMPTIBLE_WITH_FALLBACK_GCP

This is how I reference it in my deployment.yaml:

    job_clusters:
      - job_cluster_key: "{{env['SCHEMA']~'_'~env['TABLE_NAME']~'_etl'}}"
        new_cluster:
          gcp_attributes: {% include 'includes/gcp_attributes.yaml.j2' %}

When I try moving the include one line down, I get this error:

    ScannerError: mapping values are not allowed here
      in "<unicode string>", line 26, column 32:
        gcp_attributes: zone_id: auto

Are you sure that include works for YAML files and not only for JSON?

On the same topic: is it possible to add support for includes in YAML templates?
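The ScannerError quoted above is what a YAML parser reports when a multi-line fragment is pasted directly after a key: the first line of the fragment ends up on the same line as `gcp_attributes:`, which gives two mapping separators on one line. The sketch below uses plain Python, with no dbx or Jinja involved, to show the shape the rendered text needs to have. The `indent_fragment` helper and the nesting depth of 4 spaces are illustrative assumptions, not part of dbx.

```python
import textwrap

# Contents of the included fragment, as posted in this thread.
FRAGMENT = (
    "zone_id: auto\n"
    "use_preemptible_executors: true\n"
    "availability: PREEMPTIBLE_WITH_FALLBACK_GCP\n"
)


def indent_fragment(text, width):
    """Prefix every non-empty line of the fragment with `width` spaces."""
    return textwrap.indent(text, " " * width)


# The fragment starts on its own line, nested one level below the key.
rendered = (
    "new_cluster:\n"
    "  gcp_attributes:\n"
    + indent_fragment(FRAGMENT, 4)
)

if __name__ == "__main__":
    print(rendered)
```

Inside a template, wrapping the include in Jinja's built-in `indent` filter, for example with a `{% filter indent(width=...) %}` block, may achieve the same effect. Whether dbx's include support handles that has not been confirmed in this thread.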