Closed · dee-see closed this 3 years ago
vita -d comcast.net results in "memory allocation of 1610612736 bytes failed".
I don't know if there's something going wrong with the error message, but that's quite a bit of memory (1610612736 bytes is 1.5 GiB)!
My guess is that comcast.net returns such an absurd number of results that we're actually running out of memory trying to hold them all.
Flushing results to output as they are fetched would solve that; however, it might make it difficult to output only unique results. Personally I wouldn't mind a --flush switch that outputs duplicated results.
Yea, I think this is a good idea. Maybe the solution is to make the Runner.run method return a stream, and the PostProcessor.clean method return an iterator over the filtered results. Depending on the CLI flag, we then either remove duplicates by collecting the iterator into a HashSet, or write each result straight to stdout with a BufWriter. What do you think?
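Not part of the thread, but the proposed split could be sketched roughly like this. Everything here is hypothetical (the write_results helper and its signature are made up for illustration), and a plain synchronous iterator stands in for the eventual async stream:

```rust
use std::collections::HashSet;
use std::io::{self, BufWriter, Write};

// Hypothetical sketch of the flush-vs-dedup decision: with the flush flag
// we write each result as it arrives (constant memory, duplicates possible);
// without it we collect everything into a HashSet first so output is unique.
fn write_results<I, W>(results: I, flush: bool, out: W) -> io::Result<()>
where
    I: Iterator<Item = String>,
    W: Write,
{
    let mut out = BufWriter::new(out);
    if flush {
        for r in results {
            writeln!(out, "{}", r)?; // streaming path: nothing is buffered
        }
    } else {
        // dedup path: buffers every result, which is exactly what blows up
        // on targets with a huge number of subdomains
        let unique: HashSet<String> = results.collect();
        for r in unique {
            writeln!(out, "{}", r)?;
        }
    }
    out.flush()
}
```

In vita itself `out` would be a locked stdout handle; it's generic here only so the sketch is easy to test.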
I think that sounds great!
Another thing I noticed while digging into this issue: I was allocating another large Vec for the SonarSearch results coming over gRPC.
https://github.com/junnlikestea/vita/blob/3782231ede2da49e98cf915e6c638725c7cabb04/crobat/src/lib.rs#L48-L63
Because the type returned by the line below implements the Stream trait, we could probably just return it and avoid all those extra allocations.
https://github.com/junnlikestea/vita/blob/3782231ede2da49e98cf915e6c638725c7cabb04/crobat/src/lib.rs#L56
So the method would look something like:
pub async fn get_subs(&mut self, host: Arc<String>) -> Result<impl Stream<Item = std::result::Result<Domain, Status>>> {
    trace!("querying crobat client for subdomains");
    let request = tonic::Request::new(QueryRequest {
        query: host.to_string(),
    });
    debug!("{:?}", &request);
    let stream = self.client.get_subdomains(request).await?.into_inner();
    Ok(stream)
}
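To illustrate the shape of what a caller would then do: each item of the returned stream arrives as a Result, so errors get handled per item and no intermediate Vec is needed. This sketch is hypothetical and synchronous — an iterator of Results stands in for the async Stream, and the Domain struct and String error are stand-ins for the real tonic types:

```rust
// Hypothetical stand-in for the gRPC-generated Domain message.
#[derive(Debug, PartialEq)]
struct Domain {
    name: String,
}

// Consume results one at a time; a transport error on any item aborts,
// mirroring how `item?` would propagate a tonic::Status. The names are
// collected here only so the sketch is testable — vita would write each
// one out (or dedup it) immediately instead of buffering.
fn consume<I>(items: I) -> Result<Vec<String>, String>
where
    I: Iterator<Item = Result<Domain, String>>,
{
    let mut names = Vec::new();
    for item in items {
        let domain = item?; // per-item error handling
        names.push(domain.name);
    }
    Ok(names)
}
```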
Currently the tool fetches all subdomains and only at the end prints them all to stdout. When running against targets with very large numbers of subdomains, this allocates a very large vector of subdomains before calling cleaner.clean(subdomains). This is made worse by the fact that I'm running Vita on a 2 GB VPS, which simply can't handle it and crashes.