Open agpcardoso opened 6 years ago
As a workaround, I wrote this function, which applies HttpUtility.HtmlEncode only to the text content and leaves all HTML tags and script code untouched.
If anyone needs it, my code is below.
Usage:
string _returnHtmlTreated = this.ApplyHtmlTreatments(@"\Letters\ModelsLetterShipment\", "Model1.htm");
// Namespaces this method needs
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;

public string ApplyHtmlTreatments(string directoryModeloHtml, string nameFileModeloHtml)
{
    string _conteudoArquivoPorLinha;
    StringBuilder _retornoConteudoTratado = new StringBuilder();
    StringBuilder _todoHtml = new StringBuilder();

    // Open the HTML file
    using (StreamReader file = new StreamReader(directoryModeloHtml + nameFileModeloHtml, Encoding.GetEncoding("iso-8859-1")))
    {
        // Concatenate the whole file, line by line, into _todoHtml
        while ((_conteudoArquivoPorLinha = file.ReadLine()) != null)
            _todoHtml.Append(_conteudoArquivoPorLinha + " ");
    }

    // Put 10 spaces before and after every HTML tag
    _todoHtml.Replace("<", "          <")
             .Replace(">", ">          ")
             .Replace(" ", " ");

    // Split at every run of 10 spaces, collecting the pieces in an array
    var _trechosArray = _todoHtml.ToString().Split(new[] { "          " }, StringSplitOptions.RemoveEmptyEntries);

    // Walk the pieces, applying the encoding ONLY to pieces that are NOT HTML
    foreach (var _trecho in _trechosArray)
    {
        string _texto = _trecho.Trim();
        if (_texto.Length == 0)
            continue; // nothing but whitespace between two tags

        string _trechoTratado;
        if (!Regex.IsMatch(_texto, @"<.*?>"))
        {
            // Not an HTML tag: encode it
            _trechoTratado = HttpUtility.HtmlEncode(_texto) + " ";
        }
        else if (_texto.IndexOf("<img", StringComparison.OrdinalIgnoreCase) >= 0)
        {
            // img tag: set the complete path, including the file:/// prefix
            _trechoTratado = _texto.Replace("src=\"", "src=\"file:///" + directoryModeloHtml.Replace(@"\", "/"));
        }
        else
        {
            // Any other tag is kept as-is
            _trechoTratado = _texto + " ";
        }
        _retornoConteudoTratado.Append(_trechoTratado);
    }
    return _retornoConteudoTratado.ToString();
}
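For comparison, here is a minimal alternative sketch of the same idea (encode the text, leave the tags alone) that splits on the tags directly instead of padding them with spaces. It is not the code from the post above: the names HtmlTextEncoder and EncodeTextOnly are made up for illustration, and it uses System.Net.WebUtility.HtmlEncode instead of HttpUtility.HtmlEncode so that it does not need a System.Web reference. It assumes the text contains no stray < or > characters and no entities that are already encoded (those would be encoded twice), and it does not treat the bodies of script or style elements specially.

```csharp
using System;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

public static class HtmlTextEncoder
{
    // Splitting on a capturing group keeps the matched tags in the result array.
    private static readonly Regex TagSplitter = new Regex(@"(<[^>]+>)");

    public static string EncodeTextOnly(string html)
    {
        var result = new StringBuilder();
        foreach (string piece in TagSplitter.Split(html))
        {
            bool isTag = piece.StartsWith("<", StringComparison.Ordinal)
                      && piece.EndsWith(">", StringComparison.Ordinal);

            // Tags pass through unchanged; everything else is HTML-encoded.
            result.Append(isTag ? piece : WebUtility.HtmlEncode(piece));
        }
        return result.ToString();
    }
}
```

Calling HtmlTextEncoder.EncodeTextOnly("<p>a & b</p>") leaves both tags untouched and encodes only the ampersand in the text between them.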
Perhaps it is related, so I'm posting it here. We have a similar problem when the provided HTML is not valid XHTML. For instance, <meta ....> must be closed in XHTML. If I provide the HTML without the closing tag, the generated PDF is a plain-text representation of the HTML. When I do provide the closing tag, a proper PDF is generated.
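If the input HTML cannot be fixed at the source, one possible pre-processing step is to self-close the common void elements before passing the markup to the converter. The sketch below is only an assumption about what might help, not something confirmed in this thread: the names XhtmlFixups and CloseVoidElements are hypothetical, the list of elements is deliberately short, and the regex will misfire if an attribute value contains a > character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class XhtmlFixups
{
    // Matches a few void elements, whether or not they are already self-closed.
    private static readonly Regex VoidTag = new Regex(
        @"<(meta|link|br|hr|img|input)\b([^>]*?)\s*/?>",
        RegexOptions.IgnoreCase);

    public static string CloseVoidElements(string html)
    {
        // Rebuild each match as <name attributes />, keeping the attributes as written.
        return VoidTag.Replace(
            html,
            m => "<" + m.Groups[1].Value + m.Groups[2].Value + " />");
    }
}
```

For example, XhtmlFixups.CloseVoidElements("<meta charset=\"utf-8\">") returns the same tag with " />" at the end, and a tag that is already self-closed, such as <br/>, is normalized to <br />.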
Regardless of the configuration, the charset does not change
1 - Below is the HTML that I'm trying to convert
2 - Below is my code that does the conversion
Controller Code
x.teste() Function Code
3 - Below is my converted PDF file: teste_1.pdf