rjatkins / owaspantisamy

Automatically exported from code.google.com/p/owaspantisamy

Remove hard dependency on xercesImpl #102

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
It appears that AntiSamyDOMScanner has a hard dependency on XercesImpl 
(org.apache.xerces,xml).  With modern JVMs, managing XML dependencies is a 
tricky business.  A hard dependency on Xerces potentially makes it harder for 
applications to consume your library.  Here are some of the problems you may 
run into:

* Heap and classpath bloat: Java 5+ already includes Xerces under the package 
com.sun.org.apache.xerces.  Xerces is a 1.2 MB dependency.  When using antisamy 
with the Oracle JRE, you are potentially loading several classes that duplicate 
behavior already in the JRE.

* IBM JDK incompatibility: The IBM JDK 5+ includes Xerces under the original 
package org.apache.xerces.  Having XercesImpl on the classpath won't 
necessarily cause additional code bloat, but it could cause classloading 
problems, since multiple versions of the same class may be loaded in 
different classloaders.  This can cause problems that are difficult to 
identify and debug when deploying an application in app servers like 
WebSphere.  Consumers of antisamy using the IBM JDK should exclude XercesImpl.
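
To check which of the two situations above applies on a given JVM, it can help to ask JAXP which parser implementation it actually resolved and where that class was loaded from.  The following is only a rough sketch: the class name XmlImplProbe and its methods are made up for illustration and are not part of antisamy.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper (not part of antisamy): reports which DOM parser
// implementation JAXP picked and which jar or module it came from.
final class XmlImplProbe {

    // Returns "<class name> <- <origin>" for the given class.
    static String describe(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes on the bootstrap class path typically have no code source.
        String origin = (source == null || source.getLocation() == null)
                ? "bootstrap class path (no code source)"
                : source.getLocation().toString();
        return type.getName() + " <- " + origin;
    }

    // The concrete factory class selected by the standard JAXP lookup.
    static Class<?> activeDomFactory() {
        return DocumentBuilderFactory.newInstance().getClass();
    }

    public static void main(String[] args) {
        System.out.println(describe(activeDomFactory()));
    }
}
```

If the reported class lives under org.apache.xerces and comes from a xercesImpl jar, the bundled copy is winning over whatever the JRE ships; if it lives under com.sun.org.apache.xerces, the JRE's own copy is in use.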

I think antisamy would generally be easier to consume if it were refactored 
to rely upon the standard xml-apis.  Where antisamy relies upon non-standard 
utilities in xerces, such as HtmlSerializers, I think we should include 
these classes in antisamy under the antisamy namespace.  I would be happy to 
supply a patch if the project is agreeable to the concept.
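
As a rough illustration of what "standard xml-apis only" could look like, here is a minimal sketch that builds a DOM and serializes it as HTML through JAXP (javax.xml.parsers and javax.xml.transform) without importing anything from org.apache.xerces.  Note that this is a different technique from copying the serializer classes into antisamy, and it is an assumption, not something verified here, that the JAXP "html" output method is an adequate substitute for what antisamy gets from the xerces serializers.  The class name JaxpHtmlSketch is hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch (not antisamy code): DOM creation and HTML output
// using only the standard JAXP interfaces.
final class JaxpHtmlSketch {

    // Builds a tiny document: <p>hello & goodbye</p>
    static Document buildSample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element p = doc.createElement("p");
            p.setTextContent("hello & goodbye");
            doc.appendChild(p);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("could not build sample DOM", e);
        }
    }

    // Serializes the DOM with the standard "html" output method.
    static String toHtml(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, "html");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize DOM", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml(buildSample()));
    }
}
```

Whichever implementation the JVM provides gets picked up by the factory lookups, so the same code should run on both the Oracle and IBM runtimes without xercesImpl being declared as a dependency.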

Mike

Original issue reported on code.google.com by you...@gmail.com on 10 Mar 2011 at 8:26

GoogleCodeExporter commented 9 years ago
Actually, this might be a bigger job than I originally thought, since neko 
depends upon xercesImpl as well.  :(  It looks like Neko is much more tied to 
xerces than antisamy is, so this issue may not be reasonably fixable unless 
there is an alternative to neko available.

Original comment by you...@gmail.com on 10 Mar 2011 at 8:48

GoogleCodeExporter commented 9 years ago
Actually, looking at it again, Neko provides the option of using a 
xercesMinimal.jar, which perhaps could be used, though it would require the 
creation of a "Scanner" that uses XNI directly.  It appears the SAX Scanner is 
pretty close to being there, so it could be possible.  If there is interest in 
this, I could take a stab at it.

Mike

Original comment by you...@gmail.com on 10 Mar 2011 at 9:28

GoogleCodeExporter commented 9 years ago
A stab at this would be much appreciated - I've hated the dependency since day 
one.

Original comment by arshan.d...@gmail.com on 22 Mar 2011 at 12:17

GoogleCodeExporter commented 9 years ago
Feel free to re-open if you get a patch together.

Original comment by arshan.d...@gmail.com on 15 Sep 2011 at 8:20