pixelb / scripts

scripts from pixelbeat.org
http://www.pixelbeat.org/scripts/

ansi2html.sh dies on specific input. #17

Open John-Schlick opened 10 years ago

John-Schlick commented 10 years ago

Given the git diff file below, ansi2html (the newest version pulled straight from the repo) generates output that stops midway through the file. Since the "á" shows up in the output as a "�", I suspect that this somehow matters.

The HTML stops on this line: informe. Una agencia investigadora de informes de crédito deber�°

========= Input file ======================= diff --git a/classes/screening/report/ui.class.php b/classes/screening/report/ui.class.php index 3476082..7fbb43d 100755 --- a/classes/screening/report/ui.class.php +++ b/classes/screening/report/ui.class.php @@ -319,6 +319,14 @@ class screening_Report_UI } break;  + // Decide what version of the California disclaimer to use on pur printer friendly forms. + case REPORT_COMPONENT_PRINTER_FRIENDLY_HEADER: + $eligReleaseV2 = Params::get('screening', 'report_component_printer_friendly_header.release.v2'); + if ($eligReleaseV2 <= $createTime) { + $renderVersion = 2; + } + break; + case REPORT_COMPONENT_PHYSICAL_CRIMINAL_SEX_OFFENDER: case REPORT_COMPONENT_TALENTSHIELD_PHYSICAL_CRIMINAL_SEX_OFFENDER: case REPORT_COMPONENT_PHYSICAL_CRIMINAL_NATIONWIDE_DB: diff --git a/core/inc/defines.inc.php b/core/inc/defines.inc.php index a703c53..5082859 100755 --- a/core/inc/defines.inc.php +++ b/core/inc/defines.inc.php @@ -195,8 +195,9 @@ define ("REPORTYPE_CONSUMERDMV", "57"); //Consumer Driving Report // NOTE::::::::::::::: PLEASE DO NOT ADD ANY REPORT TYPE WITHOUT TALKING TO NIRAJ  /****/ -//Defines the possible ReportComponents -//DO NOT CHANGE THE DEFINE NUMBERS ONCE THEY HAVE BEEN USED IN PRODUCTION +// Defines the possible ReportComponents +// DO NOT CHANGE THE DEFINE NUMBERS ONCE THEY HAVE BEEN USED IN PRODUCTION +// Kept in Intelius.Packages.ReportComponents (| separated list, search for LIKE %value%) define ("REPORT_COMPONENT_NONE", "0"); define ("REPORT_COMPONENT_SUMMARY", "1"); define ("REPORT_COMPONENT_PROPERTY", "7"); @@ -287,6 +288,7 @@ define ("REPORT_COMPONENT_PHYSICAL_CRIMINAL_1_STATE", "96"); // Like 110 (natcri define ("REPORT_COMPONENT_ISERVICES_SEX_OFFENDER", "97"); define ("REPORT_COMPONENT_PHYSICAL_CIVILCOUNTY", "98"); define ("REPORT_COMPONENT_SOCIAL_NET_SUMMARY", "99"); +define ("REPORT_COMPONENT_PRINTER_FRIENDLY_HEADER", "100"); define ("REPORT_COMPONENT_PHYSICAL_PHYSICAL_EXAM", "101"); // Physical Physical -- 
I meant to do that define ("REPORT_COMPONENT_PHYSICAL_ESCREEN_DRUG_SCREEN", "102"); define ("REPORT_COMPONENT_INCART_ACCEPTANCE_MARKETING", "103"); diff --git a/core/inc/uberreport.inc.php b/core/inc/uberreport.inc.php old mode 100644 new mode 100755 index 36f768f..202d377 --- a/core/inc/uberreport.inc.php +++ b/core/inc/uberreport.inc.php @@ -173,6 +173,9 @@ class UberReport public function DisplayUberReport($Owner, $GetUserObject, $applicantId, $isPrinterFriendlyPage = 0, $echoOut = TRUE) { global $SiteConfigCore; + + // guarantee the define of the theme class. + require_once('inc/template.inc.php'); theme::factory()->addDir('screening/tpl'); // In case we don't have it yet  // ReportContext supersedes isPrinterFriendlyPage @@ -971,18 +974,57 @@ class UberReport $includeFCRA = false; }  - $civilCode = ''; - if ($includeFCRA){ - // for internation uberform, we do not show Civil code nor FCRA - $civilCode = 'Per California Civil Code 1786, '; + // figure out what render version to use. + $FakeReqProfile = array( + 'App' => REPORT_COMPONENT_PRINTER_FRIENDLY_HEADER, + 'CreateTime' => $this->m_CreateTime, + ); + $FakeReportData = array(); + $renderVersion = screening_Report_UI::getRenderVersion($FakeReqProfile, $FakeReportData, $this->m_ReportContext); + + if ($renderVersion === 1) { + $civilCode = ''; + if ($includeFCRA){ + // for internation uberform, we do not show Civil code nor FCRA + $civilCode = 'Per California Civil Code 1786, '; + } + + // 12pt font is a legal requirement that presumably applies to Web as well as print + $legalTopHtml = '

'.$civilCode.$this->m_SiteName.' does not + guarantee the accuracy or truthfulness of the information in this report as to the + person who is the subject of the investigation, only that the information is accurately copied from + public records. Information generated as a result of identity theft, including evidence of + criminal activity, may be inaccurately associated with the person who is the subject of the report.
'."\r\n"; + } else { + // New and Improved, with a fresh spring scent! + // 12pt font is a legal requirement that presumably applies to Web as well as print + $legalTopHtml = '
 + California Applicants/Employees Only: The report does not guarantee the + accuracy or truthfulness of the information as to the subject of the + investigation, but only that it is accurately copied from public records, + and information generated as a result of identity theft, including + evidence of criminal activity, may be inaccurately associated with the + consumer who is the subject of the report. An investigative consumer + reporting agency shall provide a consumer seeking to obtain a copy of a + report or making a request to review a file, a written notice in simple, + plain English and Spanish setting forth the terms and conditions of his + or her right to receive all disclosures, as provided in Section + 1786.26.
 +
 + Sólo para los Solicitantes/Empleados de California: En el informe no se + garantiza la exactitud o veracidad de la información en cuanto al tema + de la investigación, sino sólo que se ha copiado exactamente de los + registros públicos, y la información generada como resultado del robo + de identidad, incluyendo las pruebas de una actividad delictiva, podría + estar incorrectamente asociada con el consumidor que sea el sujeto del + informe. Una agencia investigadora de informes de crédito deberá + suministrarle a un consumidor que trate de obtener una copia de un + informe o solicite revisar un archivo una notificación por escrito en + inglés y español lisos y llanos, en la que se establezcan los términos + y las condiciones de su derecho a recibir toda la información, como se + dispone en la Sección 1786.26. +
'."\r\n"; } - - // 12pt font is a legal requirement that presumably applies to Web as well as print - $legalTopHtml = '
'.$civilCode.$this->m_SiteName.' does not - guarantee the accuracy or truthfulness of the information in this report as to the - person who is the subject of the investigation, only that the information is accurately copied from - public records. Information generated as a result of identity theft, including evidence of - criminal activity, may be inaccurately associated with the person who is the subject of the report.
'."\r\n"; }  $legalBottomHtml = ''; diff --git a/tests/unit/DataProvider/UserProvider.php b/tests/unit/DataProvider/UserProvider.php new file mode 100755 index 0000000..80cf4d2 --- /dev/null +++ b/tests/unit/DataProvider/UserProvider.php @@ -0,0 +1,34 @@ +<?php +/__ + * The UserProvider class is to provide User related things. + * In the future we envision it being expanded to being able to generate user related things. +
/ +class UserProvider +{ + / + * Log in a user based on their email and password. +  + * @param $email + * @param $password + / + public function loginUser($email, $password) + { + global $User; + $User = array(); + UserLogin($User, $email, $password, SITE_CONFIG_TALENTWISE); + } + + / + * Get a valid user based on their id. +  + * @param integer $userId - the id of the user to look up. + / + public function getUser($userId) + { + $data = GetUserInformation($userId); + + return $data[0]; + } + +} +?> \ No newline at end of file diff --git a/tests/unit/core/inc/UberReportTest.php b/tests/unit/core/inc/UberReportTest.php new file mode 100755 index 0000000..44da4b2 --- /dev/null +++ b/tests/unit/core/inc/UberReportTest.php @@ -0,0 +1,49 @@ +<?php +/__ + * Unit tests for uberreport.inc.php +  + * @author jschlick@TalentWise.com + * @package UnitTests + / +class test_UberReport extends TalentWise_FrameworkTestCase +{ + protected $uberFormProvider; + + protected $uberReport; + + / + * Get ourselves an UberReport that we can use for our calls. + / + protected function setUp() + { + $uberReport = new UberReport(); + $this->uberReport = $uberReport; + } + + + // This is the weakest unit test ever. + // We call it, and make sure the report has the header we coded for. + public function testGetUberFormUser() + { + $userId = 16520012; + $userProvider = new UserProvider(); + $Owner = $userProvider->getUser($userId); + $GetUserObject = array($userId); + $applicantId = 60924957; + // This is critical to our test. It MUSt be a printer friendly page to have the header. + $isPrinterFriendlyPage = 1; + $echoOut = false; + + $this->uberReport->DisplayUberReport($Owner, $GetUserObject, $applicantId, $isPrinterFriendlyPage, $echoOut); + + // Lets see what damage it's wrought. 
+ $className = "UberReport"; + $propertyName = "m_ReportHtml"; + $object = $this->uberReport; + $m_ReportHtml = $this->getPrivateProperty($className, $propertyName, $object); + + // ALL we changed is that printed reports, done AFTER the params date should have the header section. + $this->assertTrue(strstr($m_ReportHtml, '1786.26') !== true, "Report does NOT have the correct 1786.26 header."); + } +} +?> \ No newline at end of file

pixelb commented 10 years ago

Could you attach the diff in an email to P@draigBrady.com?

Does the original version of the script before the recent awk change work any better? https://raw.githubusercontent.com/pixelb/scripts/bd2aabd/scripts/ansi2html.sh

John-Schlick commented 10 years ago

Good question. The reason I came here to get the newest version is that I was using the older version, and thought I'd try the new one. So...

Nope, it also dies with this character.

I'll attach the diff --color (which is a .txt file) to an email to you, as well as the output html that is cut short.

pixelb commented 10 years ago

Seems locale related. I can get the new or the old script to misbehave when I give it your UTF8 input with the locale variables unset, though I can't get the output truncated like you do. I presume there is an error message on stderr when this truncation happens? Are you running the script with a weird environment? If not, what is the output from the command: locale
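A minimal sketch of why the locale matters here, assuming a POSIX shell with `printf`, `wc` and `tr` available (the variable names `sample`, `bytes` and `chars` are only illustrative). In UTF-8 the "á" is stored as two bytes, so a tool that counts or splits characters according to a single-byte locale sees two characters where a UTF-8 locale sees one:

```shell
# "á" in UTF-8 is the two bytes 0xC3 0xA1 (octal \303 \241).
sample=$(printf 'deber\303\241')

# The byte count does not depend on the locale:
# 5 ASCII letters + 2 bytes for "á" = 7.
bytes=$(printf '%s' "$sample" | LC_ALL=C wc -c | tr -d ' ')
echo "bytes: $bytes"

# The character count follows the encoding of the current locale,
# so it can differ between a UTF-8 and a single-byte locale.
chars=$(printf '%s' "$sample" | wc -m | tr -d ' ')
echo "chars in current locale: $chars"

# Show which encoding the current locale implies, if `locale` exists.
if command -v locale >/dev/null 2>&1; then
    locale charmap
fi
```

If the two counts differ from what you expect for your terminal, the script is most likely being run with a locale that does not match the encoding of the input.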

John-Schlick commented 10 years ago

I'm just running it on a QA Linux box; I don't >>think<< that anything is weird about it.

jschlick@dvm-jschlick2:/usr/local/html(BGS-1516)$ locale
LANG=en_US
LANGUAGE=
LC_CTYPE="en_US"
LC_NUMERIC="en_US"
LC_TIME="en_US"
LC_COLLATE="en_US"
LC_MONETARY="en_US"
LC_MESSAGES="en_US"
LC_PAPER="en_US"
LC_NAME="en_US"
LC_ADDRESS="en_US"
LC_TELEPHONE="en_US"
LC_MEASUREMENT="en_US"
LC_IDENTIFICATION="en_US"
LC_ALL=

(I've never used this command before, so I just blindly typed it, and this is the output.)

pixelb commented 10 years ago

You're in an ISO-8859-1 locale. It's unusual to not be in a UTF8 locale these days. A probable workaround would be to set a UTF8 locale first, like:

git diff | (export LC_ALL=en_US.utf8; ansi2html.sh) > blah.html

I'll work on making it more locale agnostic.
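The locale name can be spelled differently from system to system (en_US.utf8, en_US.UTF-8, C.UTF-8, ...), so a rough sketch of picking one that is actually installed could look like the following. It assumes `locale -a` is available, and `pick_utf8_locale` is a hypothetical helper, not part of ansi2html.sh. The final command is only echoed here rather than run:

```shell
# Print the first UTF-8 locale reported by `locale -a`,
# or nothing if none is installed (or `locale` is missing).
pick_utf8_locale() {
    locale -a 2>/dev/null | grep -i -E 'utf-?8$' | head -n 1
}

utf8_locale=$(pick_utf8_locale)

if [ -n "$utf8_locale" ]; then
    # Same idea as the workaround above, with the detected locale name.
    echo "would run: git diff --color | (export LC_ALL=$utf8_locale; ansi2html.sh) > blah.html"
else
    echo "no UTF-8 locale found; one has to be generated or installed first" >&2
fi
```

Setting LC_ALL inside the subshell keeps the change local to ansi2html.sh, so the rest of the session keeps its original locale.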

John-Schlick commented 10 years ago

Your workaround works for this case. Thanks for figuring it out; I'd probably never have gotten there (since I don't actually know what that export does...)

If you do make this more locale agnostic, please let me know, and I'll happily take the new version and apply it to the entire company here.

pixelb commented 9 years ago

The latest version is now more locale agnostic. Could you try it out? I've not marked this bug as fixed, though, as I wasn't able to recreate your output truncation issue.