Avaiga / taipy

Turns Data and AI algorithms into production-ready web applications in no time.
https://www.taipy.io
Apache License 2.0

BUG- <DataNode>.is_up_to_date raises error when never written before #1198

Closed FlorianJacta closed 4 months ago

FlorianJacta commented 4 months ago

Description

The <DataNode>.is_up_to_date method breaks when the Data Node has never been written.

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\Users\jacta\.conda\envs\bar-cutting\lib\site-packages\taipy\core\data\data_node.py", line 528, in is_up_to_date
    and ancestor_node.last_edit_date > self.last_edit_date
TypeError: '>' not supported between instances of 'datetime.datetime' and 'NoneType'
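
The same error can be triggered without Taipy, which shows that the comparison itself is the culprit. The following is a minimal sketch; `ancestor_edit` and `own_edit` are hypothetical stand-ins for `ancestor_node.last_edit_date` and `self.last_edit_date`, not names from the Taipy code base.

```python
from datetime import datetime

# Hypothetical stand-ins for the two attributes compared in is_up_to_date.
ancestor_edit = datetime(2024, 4, 1, 12, 0)  # the input data node was written
own_edit = None                              # the output data node was never written

try:
    ancestor_edit > own_edit
    error_message = None
except TypeError as exc:
    # Ordering comparisons between datetime and None are not defined.
    error_message = str(exc)

print(error_message)
```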

How to reproduce

from taipy.config import Config
import taipy as tp

Config.configure_job_executions(mode="standalone")

# Normal function used by Taipy
def double(nb):
    return nb * 2

# Configuration of Data Nodes
input_cfg = Config.configure_data_node(id="somedata", default_data=21)
output_cfg = Config.configure_data_node(id="result")

# Configuration of tasks
first_task_cfg = Config.configure_task(id="double",
                                       function=double,
                                       input=input_cfg,
                                       output=output_cfg)

# Configuration of scenario
scenario_cfg = Config.configure_scenario(id="my_scenario",
                                         task_configs=[first_task_cfg],
                                         name="my_scenario")

if __name__ == "__main__":
    tp.Core().run()
    scenario_1 = tp.create_scenario(scenario_cfg)

    print(scenario_1.result.is_up_to_date)

Expected behavior

We should not get an error; the result should be False if the Data Node is invalid.

Possible solution:

Change the is_up_to_date function in taipy/core/data/data_node.py, line 510:

@property
def is_up_to_date(self) -> bool:
    """Indicate if this data node is up-to-date.

    Returns:
        False if a preceding data node has been updated before the selected data node
        or the selected data is invalid.<br/>
        True otherwise.
    """

    if self.is_valid:
        from ..scenario.scenario import Scenario
        from ..taipy import get_parents

        parent_scenarios: Set[Scenario] = get_parents(self)["scenario"]  # type: ignore
        for parent_scenario in parent_scenarios:
            for ancestor_node in nx.ancestors(parent_scenario._build_dag(), self):
                if (
                    isinstance(ancestor_node, DataNode)
                    and ancestor_node.last_edit_date
                    and ancestor_node.last_edit_date > self.last_edit_date
                ):
                    return False
        return True
    return False

Runtime environment

Acceptance Criteria

jrobinAV commented 4 months ago

This should target both develop and 3.1 branches.

yaten2302 commented 4 months ago

Hey @jrobinAV, I'm not really familiar with Python, but may I give this issue a try? So we have to change return self.is_valid to return False in data_node.py, right?

FlorianJacta commented 4 months ago

More than just that, @yaten2302. You have to replace the whole function with the one written above. Is that right, @jrobinAV?

jrobinAV commented 4 months ago

@yaten2302 Thank you for helping us on this issue.

The current implementation of the is_up_to_date method is broken: it assumes that last_edit_date is populated, meaning that the data node has already been written at least once. This is not always true.
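
To illustrate the point, here is a minimal sketch of a None-safe freshness check. It is not Taipy's implementation: `is_fresh` is a hypothetical helper, plain datetime values stand in for the data nodes, and it ignores anything else the real `is_valid` property may check.

```python
from datetime import datetime
from typing import Iterable, Optional


def is_fresh(own_edit: Optional[datetime],
             ancestor_edits: Iterable[Optional[datetime]]) -> bool:
    """Hypothetical helper, not part of Taipy's API."""
    # A node that was never written has no edit date and cannot be up to date.
    if own_edit is None:
        return False
    # The node is stale as soon as one written ancestor is newer than it.
    return not any(
        edit is not None and edit > own_edit for edit in ancestor_edits
    )


written = datetime(2024, 4, 1, 12, 0)
later = datetime(2024, 4, 2, 12, 0)

print(is_fresh(None, [written]))    # never written: False
print(is_fresh(later, [written]))   # written after its input: True
print(is_fresh(written, [later]))   # input rewritten afterwards: False
print(is_fresh(written, [None]))    # ancestor never written: True
```

Checking the node's own edit date first means the comparison only ever runs between two datetime values.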

The code that @FlorianJacta proposed in the description should solve the issue. So, the main objective of this issue is to:

yaten2302 commented 4 months ago

Hey @jrobinAV, thanks for assigning this issue to me 👍 @FlorianJacta I've replaced the whole function in data_node.py with the one written above. But I have a question: do we have to create a separate branch, or can we work directly on the default branch, i.e. develop? Also, could you please guide me on how to create tests, as I'm not familiar with that?
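
As a rough idea of what regression tests for this bug could look like, here is a pytest-style sketch. `FakeNode` is a hypothetical stand-in written for this example only; real tests would exercise actual Taipy data nodes and follow the project's own test layout, which is not shown in this thread.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class FakeNode:
    """Hypothetical stand-in for a data node, not a Taipy class."""
    last_edit_date: Optional[datetime] = None
    ancestors: List["FakeNode"] = field(default_factory=list)

    @property
    def is_up_to_date(self) -> bool:
        # Never written: report False instead of comparing against None.
        if self.last_edit_date is None:
            return False
        return not any(
            a.last_edit_date is not None and a.last_edit_date > self.last_edit_date
            for a in self.ancestors
        )


def test_never_written_node_is_not_up_to_date():
    source = FakeNode(last_edit_date=datetime(2024, 4, 1))
    result = FakeNode(ancestors=[source])
    # The regression: this access used to raise TypeError.
    assert result.is_up_to_date is False


def test_node_written_after_its_input_is_up_to_date():
    source = FakeNode(last_edit_date=datetime(2024, 4, 1))
    result = FakeNode(last_edit_date=datetime(2024, 4, 2), ancestors=[source])
    assert result.is_up_to_date is True


def test_node_is_stale_when_input_is_rewritten():
    source = FakeNode(last_edit_date=datetime(2024, 4, 3))
    result = FakeNode(last_edit_date=datetime(2024, 4, 2), ancestors=[source])
    assert result.is_up_to_date is False
```

Each test covers one case of the expected behavior: never written, written after its input, and made stale by a later write to the input.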

yaten2302 commented 4 months ago

Runtime environment

  • Taipy: 3.1.1

Acceptance Criteria

  • [ ] Ensure new code is unit tested, and check code coverage is at least 90%
  • [ ] #1200

Also, when I tried to reproduce this issue, installing Taipy with the command pip install taipy failed with this error:
[image: screenshot of the installation error]

FlorianJacta commented 4 months ago

Try to create an entirely new Python environment. Here is how you contribute to Taipy (doc). You have to create a branch and open a Pull Request that we will review.

yaten2302 commented 4 months ago

Hey @FlorianJacta, I've created a draft PR. Could you please review it and let me know if any changes are required? PR link

FlorianJacta commented 4 months ago

Thank you, the R&D team will look into it!