kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0

%load_node truncates import statements #3760

Open natashadunstan opened 4 months ago

natashadunstan commented 4 months ago

Description

Hi, %load_node seems to break when an import statement spans multiple lines. My imports look like this:

from a import (
                         a,
                         b,
                         c
                         )

But in the cell that %load_node created, the import was cut off after the first line:

from a import (

Context

The cell couldn't be run; I had to fix the import manually.

Steps to Reproduce

@noklam to update

  1. Create a node with an import in a similar format:
from os import (path,
                 os
                 )
  2. Run %load_node on that specific node.
  3. See that the import statement gets truncated to from os import (path,

Expected Result

Full import statement in cell

Actual Result


Your Environment

noklam commented 4 months ago

Thanks for reporting. I think this happens because right now we extract import statements naively, keeping only the lines that start with the keyword "from" or "import".
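To illustrate the failure mode, here is a minimal sketch of a line-prefix filter of the kind described above. It is not Kedro's actual code: `naive_import_lines` and `node_source` are hypothetical names made up for this example.

```python
def naive_import_lines(source: str) -> list[str]:
    """Keep only lines that start with 'from' or 'import'.

    Hypothetical helper for illustration, not Kedro's implementation.
    """
    return [
        line
        for line in source.splitlines()
        if line.startswith(("from ", "import "))
    ]


# A file with one multi-line import and one single-line import.
node_source = """from os import (
    path,
    sep
)
import sys

def my_node():
    return path.join("a", "b")
"""

# Only the first line of the parenthesised import survives.
print(naive_import_lines(node_source))
# → ['from os import (', 'import sys']
```

The continuation lines (`path,`, `sep`, `)`) do not start with either keyword, so they are dropped and the resulting cell has a syntax error.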

A quick thought on this: we could add some logic to catch this case, but it may be quite fragile and still run into edge cases.

The alternative is to use a proper parser or inspect the AST.
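As a rough sketch of what the AST route could look like, the standard-library `ast` module can locate import nodes and `ast.get_source_segment` (Python 3.8+) can recover their full text. The function name `extract_imports` is an assumption for this example and is not an existing Kedro function.

```python
import ast


def extract_imports(source: str) -> list[str]:
    """Return the full source text of every top-level import statement.

    Hypothetical helper for illustration; assumes `source` is valid Python.
    """
    tree = ast.parse(source)
    statements = []
    for node in tree.body:
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            # The node records both its start and end position, so a
            # parenthesised import spanning several lines comes back whole.
            segment = ast.get_source_segment(source, node)
            if segment is not None:
                statements.append(segment)
    return statements


example = "from os import (\n    path,\n    sep\n)\nimport sys\n\nx = 1\n"
for statement in extract_imports(example):
    print(statement)
```

This only looks at top-level statements; imports nested inside functions or `if` blocks would need `ast.walk` instead, and the file must parse cleanly.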

merelcht commented 3 months ago

Discussed in backlog grooming: for the time being we'll just fix this for the base case. It would be nice to find the edge cases eventually, but for now we can stick to the basic version to get it up and running.
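One possible shape for such a base-case fix, sketched under the assumption that only parenthesised multi-line imports need handling: keep collecting lines until the parentheses opened by the import are closed. `extract_import_statements` is a hypothetical name, and this is not the fix that was actually merged.

```python
def extract_import_statements(source: str) -> list[str]:
    """Collect import statements, following parentheses across lines.

    Hypothetical base-case sketch: it ignores backslash continuations and
    would be confused by parentheses inside comments.
    """
    statements = []
    current = []  # lines of the import currently being collected
    depth = 0     # number of "(" not yet closed in that import
    for line in source.splitlines():
        if not current and not line.startswith(("from ", "import ")):
            continue  # not inside an import and not the start of one
        current.append(line)
        depth += line.count("(") - line.count(")")
        if depth <= 0:
            # Parentheses are balanced, so the statement is complete.
            statements.append("\n".join(current))
            current, depth = [], 0
    return statements


reported = "from a import (\n    a,\n    b,\n    c\n)\nimport sys\nx = 1\n"
for statement in extract_import_statements(reported):
    print(statement)
```

The edge cases noted in the docstring are the kind that a parser-based approach would handle without special casing.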