InternetHealthReport / internet-yellow-pages

A knowledge graph for the Internet
https://iyp.iijlab.net
GNU General Public License v3.0

solved #83 #120

Closed MAVRICK-1 closed 9 months ago

MAVRICK-1 commented 9 months ago

Description

#83 solved

Added a new function to add relationship properties.

@m-appel, can you review my PR? :-)

MAVRICK-1 commented 9 months ago

@m-appel, all checks have passed. Can you review the PR? :-)

MAVRICK-1 commented 9 months ago

Hi @m-appel, I made a mistake in the Cypher query in the add_relationship_properties function. Anyway, I have made some changes and used the existing function as you suggested, but I would like a review before committing. Here is my code snippet; could you review it?

URL = 'https://api.rovista.netsecurelab.org/rovista/api/overview'
ORG = 'RoVista'
NAME = 'rovista.validating_rov'

class Crawler(BaseCrawler):

    def run(self):
        """Get RoVista data from their API."""
        batch_size = 1000  # Adjust batch size as needed
        offset = 0
        entries = []
        asns = set()

        while True:
            # Make a request with the current offset
            response = requests.get(URL, params={'offset': offset, 'count': batch_size})
            if response.status_code != 200:
                raise RequestStatusError('Error while fetching RoVista data')

            data = response.json().get('data', [])
            for entry in data:
                asns.add(entry['asn'])
                if entry['ratio'] > 0.5:
                    entries.append({'asn':entry['asn'],'ratio':entry['ratio'],'label':'Validating RPKI ROV'})
                else:
                    entries.append({'asn':entry['asn'],'ratio':entry['ratio'],'label':'Not Validating RPKI ROV'})

            # Move to the next page
            offset += batch_size
            # Break the loop if there's no more data
            if len(data) < batch_size:
                break
        logging.info('Pushing nodes to neo4j...\n')
        # get ASNs and Tag IDs
        self.asn_id = self.iyp.batch_get_nodes_by_single_prop('AS', 'asn', asns)
        tag_id_not_vali= self.iyp.get_node('Tag','{label:"Not Validating RPKI ROV"}',create=False)
        tag_id_vali=self.iyp.get_node('Tag','{label:"Validating RPKI ROV"}',create=False)
        # Compute links
        links = []
        for entry in entries:
            asn_qid = self.asn_id[entry['asn']]
            if entry['ratio'] > 0.5:
                links.append({'src_id': asn_qid, 'dst_id':tag_id_vali , 'props': [self.reference, entry]})
            else :
                links.append({'src_id': asn_qid, 'dst_id':tag_id_not_vali , 'props': [self.reference, entry]})

        logging.info('Pushing links to neo4j...\n')
        # Push all links to IYP
        self.iyp.batch_add_links('CATEGORIZED', links)

Is it correct?

m-appel commented 9 months ago

Yes this looks better, but you can just commit, then I can use the GitHub interface to give easier feedback. (I will also squash all commits of the PR into one, so no worries about polluting the tree or something)

The properties for get_node should be an actual dict, not a string representing a dict. And you should set create=True (or rather delete create=False since it is the default), because the Not Validating RPKI ROV Tag node does not exist.
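A minimal sketch of the get_node point above. FakeIYP is a hypothetical in-memory stand-in written only for this illustration, not the real IYP class; the one thing taken from the thread is the call shape (label, properties, optional create). It shows the properties passed as a real dict and create left at its default, so a missing Tag node gets created.

```python
class FakeIYP:
    """Hypothetical in-memory stand-in for the IYP helper (not the real class)."""

    def __init__(self):
        # Maps (label, sorted property items) to a made-up node ID.
        self.nodes = {}

    def get_node(self, label, properties, create=True):
        # Assumption for this sketch: reject anything that is not a dict,
        # e.g. a string such as '{label:"Validating RPKI ROV"}'.
        if not isinstance(properties, dict):
            raise TypeError('properties must be a dict, got ' + type(properties).__name__)
        key = (label, tuple(sorted(properties.items())))
        if key not in self.nodes:
            if not create:
                return None
            self.nodes[key] = len(self.nodes)
        return self.nodes[key]


iyp = FakeIYP()
# Properties are passed as dicts; create is omitted because True is the default.
tag_id_vali = iyp.get_node('Tag', {'label': 'Validating RPKI ROV'})
tag_id_not_vali = iyp.get_node('Tag', {'label': 'Not Validating RPKI ROV'})
```

With create=False, the same lookup for a Tag that does not exist yet would return no ID in this sketch, which is why the flag is dropped here.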

I also understand now why the API is weird: the offset parameter is actually not an offset but a page parameter, i.e., it should be incremented by 1 instead of batch_size.
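The paging behaviour described above could be sketched roughly as follows. fetch_all and fake_fetch_page are hypothetical names for this illustration, and the fake page function replaces the HTTP request so the loop runs without network access. Starting at page 0 is an assumption carried over from the snippet's initial offset = 0.

```python
def fetch_all(fetch_page, batch_size):
    """Collect all entries from an API whose 'offset' parameter is a page index."""
    page = 0  # Assumption: the first page is 0, as in the original snippet.
    entries = []
    while True:
        data = fetch_page(page, batch_size)
        entries.extend(data)
        # A short (or empty) page means there is no more data.
        if len(data) < batch_size:
            break
        # Advance by one page, not by batch_size.
        page += 1
    return entries


def fake_fetch_page(page, count):
    """Hypothetical stand-in for the HTTP request; serves 7 made-up rows."""
    rows = [{'asn': asn, 'ratio': 0.5} for asn in range(1, 8)]
    return rows[page * count:(page + 1) * count]


all_entries = fetch_all(fake_fetch_page, batch_size=3)
```

In the crawler itself, the callable would presumably wrap the existing requests.get(URL, params={'offset': page, 'count': batch_size}) call and return the 'data' list from the JSON response.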

Btw. for future pull requests, please provide a better name and description. I believe there is also a template displayed when you create a new PR; please do not just delete everything, there are some useful checks to be aware of like "How did you test your code" and "Did you update the documentation".

If you run and test your code, you can be more confident about whether it is correct :) Feel free to ask questions when you are stuck, but please do not submit a PR with code that simply does not execute and ask if it is ok.

MAVRICK-1 commented 9 months ago

Thanks @m-appel for the guidance and feedback. I am making the changes you requested :-)

MAVRICK-1 commented 9 months ago

@m-appel I modified the code and checked it, but I am still getting this pre-commit error: