ptp can't parse more than once

DoomTaper commented 8 years ago

After initialising ptp, once ptp.parse has succeeded a first time, any later call to ptp.parse with an arbitrary pathname also succeeds, and it parses the pathname provided in the first call again.

Steps to reproduce:

1. Initialise ptp: ptp = PTP() or ptp = PTP(<tool_name>)

2. Parse a valid folder so that ptp.parse succeeds: vulns = ptp.parse(<pathname>)

3. Without initialising again, call ptp.parse any number of times with any arbitrary pathname (even an invalid one): vulns2 = ptp.parse('asdads') vulns3 = ptp.parse('xcvxvxv')

You will see that every call succeeds.

Problem: self.parser is initialised on the first successful call. On later calls, self.parser is already set, so self._init_parser doesn't run and self.stream is never updated.
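To make the failure mode concrete, here is a minimal sketch of the caching pattern described above. It is a toy stand-in, not ptp's actual code: the class name, the 'report.xml' check, and the body of _init_parser are assumptions made purely for illustration.

```python
class CachingParserDemo:
    """Toy stand-in for PTP; NOT the real ptp implementation."""

    def __init__(self):
        self.parser = None  # set once, on the first successful parse
        self.stream = None  # pathname the parser was built for

    def _init_parser(self, pathname):
        # Hypothetical check: pretend only 'report.xml' is a valid report.
        if pathname != 'report.xml':
            raise ValueError('no parser found for %r' % pathname)
        self.stream = pathname
        self.parser = lambda: ['vulns from ' + self.stream]

    def parse(self, pathname):
        # Bug: initialisation is skipped once self.parser is set, so the
        # pathname argument is ignored on every later call.
        if self.parser is None:
            self._init_parser(pathname)
        return self.parser()


demo = CachingParserDemo()
first = demo.parse('report.xml')
second = demo.parse('asdads')  # invalid pathname, yet no error is raised
```

In this sketch the second call returns the same list as the first, and demo.stream still points at 'report.xml'.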

How can ptp be more useful once this issue is solved?

If, say, I have 10 reports from different tools, I will not initialise ptp with any tool and will just call ptp.parse 10 times to get the ranking of each report.
If I have 10 reports from a single tool, say w3af, I will initialise ptp with w3af and call ptp.parse 10 times.

DoomTaper commented 8 years ago

@DePierre I will solve this too while implementing http parser

DePierre commented 8 years ago

@DoomTaper IMO this needs some discussion to agree on what PTP should do.

I agree with you about the fact that PTP should be smarter and reset the parser when a new report is fed. For instance, I should be able to do:

>>> from ptp import PTP
>>> p = PTP()
>>> p.parse(filename='robots.txt')
[ . . . ]  # PTP has detected robots.txt and chosen the correct parser
>>> p.parse(filename='w3af.xml')
[ . . . ]  # PTP has detected w3af tool and updated the correct parser

In that case, the first call to parse() should detect that the report is from robots.txt and the second call to parse() should detect that the report is from w3af.
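One way to picture that behaviour is to run detection on every call to parse() rather than reusing the previous parser. The sketch below is only an illustration under assumptions: the class, the KNOWN_REPORTS mapping, and detection by file name are hypothetical and do not reflect how ptp actually detects tools.

```python
import os


class RedetectingDemo:
    """Toy stand-in for PTP; NOT the real ptp implementation."""

    # Hypothetical mapping from report file name to tool name.
    KNOWN_REPORTS = {'robots.txt': 'robots', 'w3af.xml': 'w3af'}

    def __init__(self):
        self.tool = None

    def parse(self, filename):
        # Re-run detection on every call instead of reusing the last parser.
        name = os.path.basename(filename)
        if name not in self.KNOWN_REPORTS:
            raise ValueError('no parser found for %r' % filename)
        self.tool = self.KNOWN_REPORTS[name]
        return ['finding from ' + self.tool]


p = RedetectingDemo()
from_robots = p.parse(filename='robots.txt')
from_w3af = p.parse(filename='w3af.xml')
```

With detection repeated per call, an unknown pathname raises an error instead of silently reusing the earlier report.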

Now the question is:

Should the list of vulnerabilities be reset between two calls to parse()?

In my opinion, it shouldn't. Meaning that in the example above, the second call to parse() should return the list of vulnerabilities from both robots.txt AND w3af.xml.

Then, if the user wants to keep the lists of vulnerabilities separate, like we do in OWTF, the user should instantiate ptp twice, like so:

>>> from ptp import PTP
>>> p = PTP()
>>> p.parse(filename='robots.txt')
[ . . . ]  # List of vulns from robots.txt only
>>> p = PTP()
>>> p.parse(filename='w3af.xml')
[ . . . ]  # List of vulns from w3af only

@DoomTaper what do you think?

7a commented 8 years ago

I think cumulative parsing is in principle an unlikely use case, but I agree it is a cool option to have. What about the following?

p = PTP(cumulative=True) # Accumulates findings from all parsing operations

vs.

p = PTP(cumulative=False) # Each parsing operation auto-resets ptp

Then, regardless of the ptp user's intentions, there's no need to re-instantiate all the time, since ptp will know what to do.

This seems more user friendly to me.

Can this work?
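A rough sketch of how such a flag could behave, assuming the findings are kept in a list on the instance. Only the cumulative keyword comes from the proposal above; the class name, the vulns attribute, and the placeholder findings are hypothetical, and the code that was eventually merged may differ.

```python
class CumulativeDemo:
    """Toy stand-in for PTP; NOT the real ptp implementation."""

    def __init__(self, cumulative=False):
        self.cumulative = cumulative
        self.vulns = []  # hypothetical store of findings

    def parse(self, filename):
        if not self.cumulative:
            # Auto-reset: forget findings from earlier reports.
            self.vulns = []
        # Placeholder for real parsing: one fake finding per report.
        self.vulns.append('finding from ' + filename)
        return list(self.vulns)


acc = CumulativeDemo(cumulative=True)
acc.parse('robots.txt')
accumulated = acc.parse('w3af.xml')  # findings from both reports

fresh = CumulativeDemo(cumulative=False)
fresh.parse('robots.txt')
separate = fresh.parse('w3af.xml')  # findings from w3af.xml only
```

The same instance can be reused in both modes, which covers the OWTF case (separate lists) without re-instantiating.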

DePierre commented 7 years ago

Implemented via 713a8fba1a402203de6ca5c02bd5e14e892ba2c3, thanks @DoomTaper!

owtf / ptp

ptp can't parse more than once #12