aws-solutions / workload-discovery-on-aws

Workload Discovery on AWS is a solution to visualize AWS Cloud workloads. With it you can build, customize, and share architecture diagrams of your workloads based on live data from AWS. The solution maintains an inventory of the AWS resources across your accounts and regions, mapping their relationships and displaying them in the user interface.
https://aws.amazon.com/solutions/implementations/workload-discovery-on-aws/
Apache License 2.0

Need Guidance on using this tool and facing issues while exporting the diagram. #509

Closed gokul-industrility closed 9 months ago

gokul-industrility commented 9 months ago

Describe the bug First, I need some guidance from the team on how to use this tool. Please let me know who is available so that we can connect and discuss. Another issue: whenever we generate a diagram we are not able to export it, and even when the export works it still shows the previous diagram.

Screenshots Attached screenshot is the export related error. image

Additional context I need someone who can help me with how to use this tool, because I am facing too many issues and the behaviour is not what I expect. Please let me know so that we can connect for 30 minutes.

svozza commented 9 months ago

Unfortunately, as a team, we do not have the capacity to do individual sessions with the tool. However, I have written a blog post on the AWS blog site with detailed instructions on how to use the tool: https://aws.amazon.com/blogs/mt/visualizing-resources-with-workload-discovery-on-aws/.

Regarding your issue, I have not seen this saving issue before. Are you saying that when you save your diagram after expanding some new resources on it, the old diagram in its previous state is still showing when you export it? Can you save a newly expanded diagram with your browser dev tools open and verify that the save to S3 is not erroring? Wait until the green flashbar with the message "Diagram saved. The private diagram <name> was saved successfully" (shown in the screenshot below) appears, so you know the write to S3 should have occurred.

Steps:

  1. Create a new diagram and add some resources.
  2. Expand a resource by double clicking on it.
  3. Save the diagram by clicking Actions -> Save.
  4. Wait until the green flashbar appears.
  5. Verify that the network request to S3 did not fail.
  6. Export the newly expanded diagram.

Screenshot 2024-02-08 at 11 44 31

gokul-industrility commented 9 months ago

Thanks for your reply, please have a look at the attached screenshot.

image

svozza commented 9 months ago

As I explained in the other issue, this error is a known bug in the solution. We are working on a fix that will be included in our next release.

gokul-industrility commented 9 months ago

Earlier we were able to export the diagram without any issues, but all of a sudden it started showing the previous version of the diagram when we tried to export it, and now we are not able to export the diagram at all.

OK, understood that this will be addressed in your next release. For now, just to unblock me, is there any workaround? I need to export the generated diagram from the UI.

svozza commented 9 months ago

So the exact same diagram started failing having previously been working? I do not know the root cause of the bug yet, but what I have noticed is that if a diagram has had resources removed from it, the likelihood of this error appearing increases. By removed I mean using the Remove action in the screenshot here:

Screenshot 2024-02-08 at 13 41 54

Could that be the case here? The only 'workaround' I can think of if that is the case is to do the deletions in drawio. Unfortunately, if the diagram hasn't been changed by deleting resources then I don't have any other solution as of this moment.

gokul-industrility commented 9 months ago

I just wanted to create a network diagram. I followed the steps mentioned in the blog and was able to generate the new diagram successfully. After generating it I was also able to remove certain resources from the diagram, for example tags, which I don't want. After successfully saving the diagram, I am not able to export it to drawio.

If you don't mind, can we have a quick call so that you can see my screen and understand what I am doing and where I am facing issues? It will also help me understand whether I am following the correct steps. Thanks.

svozza commented 9 months ago

You aren't doing anything wrong; what you have just described is the bug in the software. From my investigations, it has something to do with this line of code. If I can figure out what is going on here, I might be able to give you a patch for the code in the Lambda function that creates the export to drawio.

svozza commented 9 months ago

I've been able to reproduce this error locally by adding a VPC to a diagram and then deleting its tags.

gokul-industrility commented 9 months ago

Sorry, I was away for some time and couldn't check your message. As per the steps mentioned in the blog, generating the diagram for just the VPC resource brings in all the other dependent components like subnets, route tables, etc. That is why I selected only the VPC resource type and generated the diagram.

Okay, any update? Is it working for you? If you want, let's connect for 5-10 minutes and explore together.

svozza commented 9 months ago

Yes, I think I've managed to figure out what was causing the issue. To test this you will need to patch some code in a Lambda function. Go to the Lambda console in the region where WD is deployed and find the function with the string DrawIoExportResolver-DrawIoFunction in the name. Copy and paste the code below into the code editor for the main.py file and deploy the change:

# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0

import urllib.parse
from xml.etree.ElementTree import Element, SubElement, tostring
from type_definitions import get_type_definitions
from zlib import compress
from base64 import b64encode
from operator import itemgetter

# standardized drawing margin based on cytoscape graphing library defaults
drawing_margin = 30
# Get a dictionary of icon styles based on cytoscape 'types'
types = get_type_definitions()

class Node:
    """
    Classes:
    - Node = stores hierarchical and label information for drawing nodes and edge
    in draw.io. Height and width values are dynamically created based on children
    - Edge = Connection info for nodes, produce arrows
    """

    def __init__(self, node_id, node_type, label, title, center_x, center_y, is_end_node):
        self.node_id = node_id
        self.node_type = node_type
        self.label = label
        self.title = title
        self.center_x = center_x
        self.center_y = center_y
        self.style = types[self.node_type]['style']
        self.is_end_node = is_end_node
        self.children = []

    @property
    def height(self):
        if self.is_end_node and 'height' in types[self.node_type]:
            return types[self.node_type]['height']
        elif len(self.children) == 0:
            return None
        else:
            children_points = list(
                filter(None, map(lambda c: (c.height*0.5 + c.center_y), self.children)))
            furthest_point = max(children_points)
            result = furthest_point + drawing_margin - self.y
            return result

    @property
    def width(self):
        if 'width' in types[self.node_type]:
            return types[self.node_type]['width']
        elif len(self.children) == 0:
            return None
        else:
            children_points = list(
                filter(None, map(lambda c: (c.width*0.5 + c.center_x), self.children)))
            furthest_point = max(children_points)
            result = 2*(furthest_point + drawing_margin - self.center_x)
            return result

    @property
    def x(self):
        if self.is_end_node:
            return self.center_x - 0.5*self.width
        elif len(self.children) == 0:
            return None
        else:
            children_points = list(
                filter(None, map(lambda c: (c.center_x - 0.5*c.width), self.children)))
            min_point = min(children_points)
            result = (min_point - drawing_margin)
            return result

    @property
    def y(self):
        if self.is_end_node:
            return self.center_y - 0.5*self.height
        elif len(self.children) == 0:
            return None
        else:
            children_points = list(
                filter(None, map(lambda c: (c.center_y - 0.5*c.height), self.children)))
            min_point = min(children_points)
            result = (min_point - drawing_margin)
            return result

    def add_child(self, child):
        self.children.append(child)

    def get_xml_object(self):
        # Draw IO Context
        icon = {'style': self.style, 'vertex': '1', 'parent': '1'}
        content = {
            'id': self.node_id,
            'label': self.label,
            self.node_type: self.title
        }
        coords = {
            'x': str(self.x),
            'y': str(self.y),
            'height': str(self.height),
            'width': str(self.width),
            'as': 'geometry'
        }
        # Build object
        obj = Element('object', content)
        styled_obj = SubElement(obj, 'mxCell', icon)
        # SubElement mutates styled_obj
        SubElement(styled_obj, 'mxGeometry', coords)

        return obj

class Edge:
    def __init__(self, edge_id, source, target):
        self.edge_id = edge_id
        self.source = source
        self.target = target
        self.style = types['edge']['style']

    def get_xml_object(self):
        content = {
            'id': self.edge_id,
            'style': self.style,
            'parent': '1',
            'source': self.source,
            'target': self.target,
            'edge': '1'
        }
        coords = {
            'relative': '1',
            'as': 'geometry'
        }
        obj = Element('mxCell', content)
        # SubElement mutates obj
        SubElement(obj, 'mxGeometry', coords)

        return obj

def handler(event, _):
    """
    Main Lambda Handler
    """
    node_dict = dict()

    args = event['arguments']
    nodes = args.get('nodes', [])
    edges = args.get('edges', [])

    for node in nodes:
        node_id, node_type, label, title, position = \
            itemgetter('id', 'type', 'label', 'title', 'position')(node)

        if node_type == 'resource' and 'image' in node:
            node_type = node['image'].split('/')[-1].split('.')[0]

        x = position['x']
        y = position['y']
        is_end_node = node['type'] == 'resource'
        node = Node(node_id, node_type, label, title, x, y, is_end_node)
        node_dict[node_id] = node

    for node in nodes:
        node_id = node['id']
        parent = node.get('parent')
        if parent and parent in node_dict:
            node_dict[parent].add_child(node_dict[node_id])

    elements = list(node_dict.values())

    for edge in edges:
        edge_id, source, target = itemgetter('id', 'source', 'target')(edge)
        edge = Edge(edge_id, source, target)

        elements.append(edge)

    xml_output = produce_xml_output(elements)

    # Compress and encode XML tree
    xml_output_compressed_encoded = deflate_and_base64_encode(xml_output)
    # Convert XML encoded string to URL encoded string
    xml_output_url = urllib.parse.quote(xml_output_compressed_encoded, safe='')
    # Attach XML string to Draw IO URL (Note: Draw IO is now app.diagrams.net due to .io vulnerabilities)
    drawio_url = 'https://app.diagrams.net?title=AWS%20Architecture%20Diagram.xml#R' + xml_output_url

    return drawio_url

def produce_xml_output(elements):
    """
    Helper Functions:
    - produce_xml_output = creates XML tree of all diagram nodes and edges
    - deflate_and_base64_encode = returns a compressed, encoded version of XML tree string to pass to Draw IO URL
    """
    # Initialize Parent Nodes in Draw.IO XML Tree
    xml_model = Element('mxGraphModel')
    root = SubElement(xml_model, 'root')

    # Draw IO needs two default cells to start drawing
    default_cell_contents = {'id': '0'}

    # SubElement mutates root
    SubElement(root, 'mxCell', default_cell_contents)
    default_cell_contents = {'id': '1', 'parent': '0'}
    SubElement(root, 'mxCell', default_cell_contents)

    for elem in elements:
        xml_object = elem.get_xml_object()
        root.append(xml_object)

    xml_output = tostring(xml_model)
    return xml_output

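def deflate_and_base64_encode(string_val):
    # zlib.compress wraps the DEFLATE data in a 2-byte header and a
    # 4-byte Adler-32 checksum; strip both to leave a raw DEFLATE stream.
    zlibbed_str = compress(string_val)
    compressed_string = zlibbed_str[2:-4]
    return b64encode(compressed_string)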
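
One way to sanity-check a patched export locally is to decode the URL it returns back into XML. The sketch below is only an illustration: `decode_drawio_fragment` and the sample XML are hypothetical and not part of the solution. It assumes the text after `#R` is URL-quoted Base64 of a raw DEFLATE stream, which is what the code above produces.

```python
import urllib.parse
from base64 import b64decode, b64encode
from zlib import compress, decompress


def deflate_and_base64_encode(xml_bytes):
    # Same steps as the patched function: zlib-compress, then drop the
    # 2-byte zlib header and 4-byte Adler-32 trailer to leave raw DEFLATE.
    return b64encode(compress(xml_bytes)[2:-4])


def decode_drawio_fragment(url):
    # Hypothetical helper: reverse the encoding to recover the XML bytes.
    fragment = url.split('#R', 1)[1]
    raw = b64decode(urllib.parse.unquote(fragment))
    # wbits=-15 tells zlib to expect a raw DEFLATE stream with no header.
    return decompress(raw, -15)


# Illustrative sample only; a real export contains the diagram's cells.
sample_xml = b'<mxGraphModel><root><mxCell id="0" /></root></mxGraphModel>'
encoded = urllib.parse.quote(deflate_and_base64_encode(sample_xml), safe='')
url = 'https://app.diagrams.net?title=AWS%20Architecture%20Diagram.xml#R' + encoded
round_tripped = decode_drawio_fragment(url)
```

If `round_tripped` matches the XML that went in, the compression and encoding steps are behaving as expected and any remaining problem is more likely in the node geometry.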

If your diagram is quite big you might also want to increase the memory of the Lambda function to 1024 MB, as the export becomes quite computationally expensive as diagrams grow in size.
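
The memory change can also be scripted rather than made in the console. This is a rough sketch, assuming boto3 is installed and credentials are configured; the helper names are hypothetical, and only the `DrawIoExportResolver-DrawIoFunction` marker and the 1024 value come from this thread.

```python
def find_export_function(function_names, marker='DrawIoExportResolver-DrawIoFunction'):
    """Return the single function name that contains the marker string."""
    matches = [name for name in function_names if marker in name]
    if len(matches) != 1:
        raise ValueError(f'expected exactly one match, found {len(matches)}')
    return matches[0]


def raise_export_memory(region, memory_mb=1024):
    # boto3 is imported here so the pure helper above works without it.
    import boto3

    client = boto3.client('lambda', region_name=region)
    names = []
    # list_functions is paginated, so collect names across all pages.
    for page in client.get_paginator('list_functions').paginate():
        names.extend(fn['FunctionName'] for fn in page['Functions'])
    target = find_export_function(names)
    client.update_function_configuration(FunctionName=target, MemorySize=memory_mb)
    return target
```

Calling `raise_export_memory('eu-west-1')` (the region is just an example) would update the one matching function, and it raises instead of guessing if zero or several functions match.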

Just a note, I am based in the UK so I may not see your reply until tomorrow morning.

mc-gokulpalani commented 9 months ago

Thank you for the reply. I have changed the code and can now export the diagram. I have also increased the memory of the Lambda to 1024 as you mentioned. I am waiting for the WD solution to discover all the resources, then I will try to export the actual diagram and update my response in this ticket. Thanks.

I have a few questions, please clarify.

There are 8000 resources in the account. First I want to generate the network diagram, then the infrastructure diagram. I am also facing an issue while applying the resource filter in the generated diagram, for example to remove certain resources such as tags. For which Lambda function should I increase the memory to fix this? I think it is a memory issue, which is why I am asking.

Since the customer has 8000 resources in the account, keeping AWS Config recording on all the time will incur cost for them, as Config is a costly service. Can I set the data retention for Config to 30 days and turn off recording once the resources have been discovered by the WD solution, so that the Config data is available for the next 30 days and I also save cost?

svozza commented 9 months ago

> There are 8000 resources in the account. First I want to generate the network diagram, then the infrastructure diagram. I am also facing an issue while applying the resource filter in the generated diagram, for example to remove certain resources such as tags. For which Lambda function should I increase the memory to fix this? I think it is a memory issue, which is why I am asking.

I'm not sure I understand. What is the issue you are seeing? Bear in mind, that trying to create a diagram with 8000 resources will make the UI completely unusable because it is so computationally expensive to render: the browser will almost certainly hang. The solution is designed to diagram specific workloads, with hundreds of resources, not whole accounts with thousands of resources in a single diagram.

> Since the customer has 8000 resources in the account, keeping AWS Config recording on all the time will incur cost for them, as Config is a costly service. Can I set the data retention for Config to 30 days and turn off recording once the resources have been discovered by the WD solution, so that the Config data is available for the next 30 days and I also save cost?

The best way to control AWS Config costs is to reduce the recording frequency of items. This was recently announced around re:Invent. That way, instead of Config recording a change every time a resource changes, it only does it once a day for all resources. You can further minimise the costs of WD by switching to Neptune serverless and changing the frequency of the discovery process to run only every 24 hours too. You can set this by changing the DiscoveryTaskFrequency CloudFormation parameter; there are more details on this in the docs.
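
For anyone who wants to script the recording-frequency change, the following is a minimal sketch rather than a recommended procedure. It assumes boto3 is installed and that the Config API accepts a `recordingMode` with `recordingFrequency` set to `DAILY`; the helper names are hypothetical. Note that it deliberately drops any per-resource-type overrides an existing recorder may have.

```python
def with_daily_recording(recorder):
    """Return a copy of a recorder definition switched to daily recording."""
    # Copy only the fields this sketch assumes put_configuration_recorder
    # needs; anything else returned by the describe call is left out.
    kept = ('name', 'roleARN', 'recordingGroup')
    updated = {key: recorder[key] for key in kept if key in recorder}
    # Assumption: this replaces continuous recording and any overrides.
    updated['recordingMode'] = {'recordingFrequency': 'DAILY'}
    return updated


def switch_to_daily(region):
    # boto3 is imported here so the pure helper above works without it.
    import boto3

    client = boto3.client('config', region_name=region)
    recorders = client.describe_configuration_recorders()['ConfigurationRecorders']
    for recorder in recorders:
        client.put_configuration_recorder(
            ConfigurationRecorder=with_daily_recording(recorder))
    return [recorder['name'] for recorder in recorders]
```

The discovery frequency itself is changed separately through the DiscoveryTaskFrequency CloudFormation parameter mentioned above; its allowed values are in the docs and are not reproduced here.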

mc-gokulpalani commented 9 months ago

Got it, thanks for the reply.

I understand that instead of turning it off I can reduce the Config recording frequency to daily. But are you saying that if I turn off the Config service, the resource inventory in WD will be zero?

8000 resources were discovered by the WD tool, but I will generate the diagram only for certain resources. If I select VPC, all the VPC-dependent resources will be captured in the generated diagram, right? I ask because I don't want to filter by exact resource type and add each one to the diagram.

I will select only "VPC" as the resource type, and all the VPC dependencies such as subnets, route tables, route table associations, NAT gateways, customer gateways and VPNs will be added to the diagram. Then, after saving the diagram, I will try to apply a filter to remove certain resources such as tags, route table associations, ENIs, etc. (I am talking about applying the filter in Diagram Settings before exporting to drawio.)

svozza commented 9 months ago

> But are you saying that if I turn off the Config service, the resource inventory in WD will be zero?

It depends on what you mean by turn off. If you remove Config, i.e., delete the recorder and the delivery channel then Workload Discovery will not be able to work. I'm not really sure how this data retention thing in Config you mentioned works but I presume it means you would have the resources available in WD for 30 days and then after that they'd be gone.

> I will select only "VPC" as the resource type, and all the VPC dependencies such as subnets, route tables, route table associations, NAT gateways, customer gateways and VPNs will be added to the diagram. Then, after saving the diagram, I will try to apply a filter to remove certain resources such as tags, route table associations, ENIs, etc. (I am talking about applying the filter in Diagram Settings before exporting to drawio.)

Yeah, the filters are designed exactly for that purpose. I normally have tags filtered out on my diagrams too.

mc-gokulpalani commented 9 months ago

Hi, I am not getting edges between AWS resources in any of the generated diagrams. Can you help me? Below is the screenshot. image

Earlier I used to get the edges between the resources, as in the screenshot below. image

svozza commented 9 months ago

Can you confirm that the Hide edges toggle hasn't been enabled in Diagram Settings?

Screenshot 2024-02-12 at 13 01 35

gokul-industrility commented 9 months ago

Yes, it hasn't been enabled.

gokul-industrility commented 9 months ago

Hi, I would like to share some feedback. I have successfully generated the diagram and shared it with the customer, but they didn't like it. Even though I mentioned that this tool can produce a high-level diagram, they are not willing to accept this diagram for what they paid: the cost of Workload Discovery and Config is more than $500 in a week.

I faced many issues and raised this issue to fix the problem, and I really appreciate your quick support here. Still, we didn't like this tool due to its cost and lack of other features. Other tools in the market just use a read-only IAM role in the account to onboard it into their platform and help us generate the diagram, and that is better than the WD-generated diagram. The only difference I see is that this tool discovers all resources in the account.

svozza commented 9 months ago

Thanks for the feedback, I'm sorry you won't be using the tool going forward. I'd also like to thank you for testing the patch for the bug you found; this will help lots of other users now that we've verified it works. I'm going to close this issue now.