Q: reusing parsed Schema definitions in a multiple-WSDL scenario?

ithena commented 11 years ago

In our use case we will use the soa-model library as the base utility for comparing our entire services 'build' (containing roughly 300 WSDL files and roughly 1000 XSD files).

The types defined in the XML Schemas are reused throughout the different WSDLs, but currently they need to be reparsed for each and every WSDL. We would like to reuse the already parsed Schemas, to avoid reparsing all Schemas for each and every WSDL file.

Would it be possible to reuse parsed Schemas by somehow adding them to the initial SchemaParserContext when the parsing of the Schemas starts? Or would this not have the desired effect?

keshavarzi commented 11 years ago

I see the motivation! It would make sense. At the moment there is a similar constraint for schema imports. You could modify the SchemaParser to first look up the schema in the context and skip parsing if it is available.

The question is how to recognize the desired schemas in the context. Using the targetNamespace AND the schema location as the key would be advisable.
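The lookup-before-parse idea could be sketched roughly as follows. This is a standalone Java illustration, not soa-model code: `SchemaCache`, `SchemaKey`, and `ParsedSchema` are hypothetical names, and the actual parser is passed in as a function.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for a parsed schema; not a soa-model class.
final class ParsedSchema {
    final String targetNamespace;
    final String location;

    ParsedSchema(String targetNamespace, String location) {
        this.targetNamespace = targetNamespace;
        this.location = location;
    }
}

// Hypothetical cache key combining targetNamespace AND schema location.
final class SchemaKey {
    final String targetNamespace;
    final String location;

    SchemaKey(String targetNamespace, String location) {
        this.targetNamespace = targetNamespace;
        this.location = location;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SchemaKey)) return false;
        SchemaKey other = (SchemaKey) o;
        return Objects.equals(targetNamespace, other.targetNamespace)
            && Objects.equals(location, other.location);
    }

    @Override
    public int hashCode() {
        return Objects.hash(targetNamespace, location);
    }
}

// Hypothetical shared context: look the schema up first, parse only on a miss.
final class SchemaCache {
    private final Map<SchemaKey, ParsedSchema> schemas = new ConcurrentHashMap<>();
    private final Function<SchemaKey, ParsedSchema> parser;

    SchemaCache(Function<SchemaKey, ParsedSchema> parser) {
        this.parser = parser;
    }

    ParsedSchema getOrParse(String targetNamespace, String location) {
        // The parser runs at most once per distinct (namespace, location) pair.
        return schemas.computeIfAbsent(new SchemaKey(targetNamespace, location), parser);
    }

    int size() {
        return schemas.size();
    }
}
```

One caveat with this sketch: if the parser itself calls back into the same cache while resolving imports, the nested `computeIfAbsent` on a `ConcurrentHashMap` is not allowed, so a real implementation would need a different locking strategy.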

keshavarzi commented 11 years ago

Groovy 2.2 has a new feature: the @Memoized AST transformation for methods. If you always call SchemaParser.parse() with a clearly identifiable parameter (e.g. an absolute path), it would store the results of previous executions in its cache and avoid reparsing files that have already been parsed.

ithena commented 11 years ago

@Memoized could indeed be a quick win if the parse method is called with an absolute path, assuming the memory usage characteristics of the memoize cache can be controlled.
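To illustrate what a size-controlled memoize cache means here, the following is a minimal Java sketch of a bounded, least-recently-used cache keyed by absolute path. It is not how Groovy implements @Memoized, and `BoundedParseCache` is a hypothetical name; the parse function is supplied by the caller.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: memoizes a parse function by absolute path,
// keeping at most maxEntries results in memory.
final class BoundedParseCache<V> {
    private final Map<String, V> cache;
    private final Function<String, V> parse;

    BoundedParseCache(final int maxEntries, Function<String, V> parse) {
        this.parse = parse;
        // accessOrder=true orders entries from least to most recently used.
        this.cache = new LinkedHashMap<String, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                // Evict the least recently used entry once the limit is exceeded.
                return size() > maxEntries;
            }
        };
    }

    // Synchronized because an access-ordered LinkedHashMap is modified even by get().
    synchronized V get(String absolutePath) {
        V value = cache.get(absolutePath);
        if (value == null) {
            value = parse.apply(absolutePath);
            cache.put(absolutePath, value);
        }
        return value;
    }

    synchronized int size() {
        return cache.size();
    }
}
```

The trade-off is the usual one: a small limit caps heap usage but causes widely shared schemas to be reparsed whenever they are evicted.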

In another usage scenario, where we would try to (ab)use the soa-model library for an impact analysis scenario, @Memoized would not do. Here we would like to clone the Schema tree, make one or more changes and then run the compare algorithm for all WSDL definitions that might be impacted. Here it would be much easier to bootstrap the process with an already parsed Schema tree.

The impact analysis scenario would be:

parse the entire set of WSDLs and their linked Schemas
clone and change some specific types
compare the original model with the cloned one
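The three steps above can be sketched on a toy model. This is only an illustration under strong assumptions: the "model" here is a plain map from type name to its elements (element name to element type), and `ImpactSketch`, `deepCopy`, and `changedTypes` are hypothetical names, not part of soa-model or its compare algorithm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ImpactSketch {
    // Step 2a: deep copy, so changes to the clone never leak into the original model.
    static Map<String, Map<String, String>> deepCopy(Map<String, Map<String, String>> model) {
        Map<String, Map<String, String>> copy = new TreeMap<>();
        for (Map.Entry<String, Map<String, String>> type : model.entrySet()) {
            copy.put(type.getKey(), new TreeMap<>(type.getValue()));
        }
        return copy;
    }

    // Step 3: report the names of types that were changed, removed, or added.
    static List<String> changedTypes(Map<String, Map<String, String>> original,
                                     Map<String, Map<String, String>> changed) {
        List<String> result = new ArrayList<>();
        for (String name : original.keySet()) {
            if (!original.get(name).equals(changed.get(name))) {
                result.add(name);
            }
        }
        for (String name : changed.keySet()) {
            if (!original.containsKey(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Step 1: stands in for the already parsed set of schemas.
        Map<String, Map<String, String>> original = new TreeMap<>();
        Map<String, String> address = new TreeMap<>();
        address.put("street", "xs:string");
        address.put("zip", "xs:int");
        original.put("Address", address);

        // Step 2b: change one specific type in the clone only.
        Map<String, Map<String, String>> clone = deepCopy(original);
        clone.get("Address").put("zip", "xs:string");

        System.out.println(changedTypes(original, clone));
    }
}
```

The point of the sketch is that the clone starts from the in-memory model, so no WSDL or XSD has to be parsed a second time before the comparison.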

predic8 commented 11 years ago

Hi Ruben, we have used SOA Model for Impact Analysis already and have software that can manage lots of service descriptions each in different versions. Changes are automatically detected and reported. We could talk about this on the phone. I am back in the office by next week.

Thomas

On 11/21/13 8:46 AM, Ruben wrote:

@Memoized could indeed be a quick win if the parse method is called with an absolute path. Assuming the memory usage characteristics of the memoize cache can be controlled.

In another usage scenario, where we would try to (ab)use the soa-model library for an impact analysis scenario, @Memoized would not do. Here we would like to clone the Schema tree, make one or more changes and then run the compare algorithm for all WSDL definitions that might be impacted. Here it would be much easier to bootstrap the process with an already parsed Schema tree.

Best regards, Thomas Bayer

ithena commented 11 years ago

Thomas, I passed your mail on to the person responsible for this project. I am currently only contracted for a short (two-month) project, and do not have any say over budgets or longer-term plans.

regards, Ruben

ithena commented 11 years ago

@keshavarzi, would you, in principle, be willing to merge a solution where I refactor the current WSDLParser and SchemaParser + SchemaParserContext, so that it is possible to use specific parser context implementations that handle this scenario?

After the refactoring, the WSDLParserContext and SchemaParserContext would decide for themselves which key they use for caching (currently the namespace). The WSDLParserContext would also be able to decide which SchemaParserContext to use when descending (by default, a new empty SchemaParserContext).

The current, and default, parser context implementations would keep exactly the same behavior as they have now. But when a specific (new) parser context implementation is provided when parsing a WSDL or Schema, the schema definitions would be reused when parsing multiple WSDLs.
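A rough Java sketch of how such pluggable contexts might fit together is shown below. All interface and class names are hypothetical and do not reflect the existing WSDLParserContext or SchemaParserContext APIs; the sketch only shows the two decisions described above (which cache key to use, and which schema context to hand out when descending).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical schema-level context: decides how parsed schemas are cached.
interface SchemaContextSketch {
    Object lookupOrParse(String namespace, String location, Function<String, Object> parser);
}

// Mirrors the described current behavior: cache key is the namespace only.
class NamespaceKeyedContext implements SchemaContextSketch {
    private final Map<String, Object> parsed = new HashMap<>();

    protected String keyFor(String namespace, String location) {
        return namespace;
    }

    @Override
    public Object lookupOrParse(String namespace, String location, Function<String, Object> parser) {
        // Not thread-safe; a shared context used by several parser threads would need locking.
        return parsed.computeIfAbsent(keyFor(namespace, location), k -> parser.apply(location));
    }
}

// Alternative implementation: cache key combines namespace and location.
class LocationKeyedContext extends NamespaceKeyedContext {
    @Override
    protected String keyFor(String namespace, String location) {
        return namespace + "|" + location;
    }
}

// Hypothetical WSDL-level context: decides which schema context to use when descending.
interface WsdlContextSketch {
    SchemaContextSketch schemaContext();
}

// Default behavior: every descent starts with a new, empty schema context.
class FreshSchemaContextPerWsdl implements WsdlContextSketch {
    @Override
    public SchemaContextSketch schemaContext() {
        return new NamespaceKeyedContext();
    }
}

// New behavior: one schema context is shared across all WSDLs.
class SharedSchemaContext implements WsdlContextSketch {
    private final SchemaContextSketch shared = new LocationKeyedContext();

    @Override
    public SchemaContextSketch schemaContext() {
        return shared;
    }
}
```

With the default context, two WSDLs importing the same XSD each trigger a parse; with the shared context, the second WSDL gets the instance parsed for the first.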

I will provide the refactoring as a pull request, so that you can comment on concrete code changes.

keshavarzi commented 11 years ago

@ithena Ruben, we are discussing your problem and will get back to you. Until then, could you let us know how long a delay we are talking about, or which other resource makes this critical? Thanks

ithena commented 11 years ago

@keshavarzi Parsing our 300 WSDLs + Schemas (each WSDL has a separate messages XSD, and all of them reuse the same imported schemas) takes about 30 seconds on my recent MacBook Pro with 8 parser threads, using 500-750 MB of heap space. On the organization's less capable machines it takes multiple minutes.

Another reason to keep only one parsed instance of a given Schema in memory is that it would allow us to implement a 'faux' impact analysis scenario, in which we clone the tree of parsed schemas, make a small change, and invoke the compare algorithm.

membrane / soa-model

Q: reusing parsed Schema definitions in a multiple-WSDL scenario? #168