u01jmg3 / ics-parser

Parser for iCalendar Events • PHP 8+, 7 (≥ 7.4), 5 (≥ 5.6)
MIT License

Description with HTML Link #304

Closed ucola closed 2 years ago

ucola commented 2 years ago

First of all, many thanks for this parser! It works well and is fast!

Description of the Issue:

Our event descriptions contain some HTML code, e.g. an <a href> link from Microsoft Teams. When I check the ICS file, the link is in there. Is it possible to read out the description including the HTML code?

Steps to Reproduce:

Parse an ICS file with HTML code inside the DESCRIPTION property.

ucola commented 2 years ago

I found out that this happens in the code below. Any idea how to solve this?

protected function unfold(array $lines)
{
    $string = implode(PHP_EOL, $lines);
    $string = preg_replace('/' . PHP_EOL . '[ \t]/', '', $string);

    $lines = explode(PHP_EOL, $string);

    return $lines;
}

u01jmg3 commented 2 years ago

As per the issue template, which has not been filled in, I need the offending iCal file before I can do any investigation.

As for why the unfold() method exists, please see the iCalendar specification, RFC 5545.
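
For context, RFC 5545 (section 3.1) folds long content lines by inserting a line break followed by a single space or tab, and unfolding removes that pair again. A minimal standalone sketch of the idea follows. It is not the library's code: the function name `unfoldLines` is hypothetical, and it assumes the input has already been split into lines with the line endings stripped.

```php
<?php
/**
 * Unfold iCalendar content lines (RFC 5545, section 3.1).
 * Hypothetical helper for illustration; not part of ics-parser.
 *
 * @param string[] $lines Lines with their line endings already stripped.
 * @return string[] Logical (unfolded) content lines.
 */
function unfoldLines(array $lines): array
{
    $unfolded = [];
    foreach ($lines as $line) {
        // A continuation line starts with exactly one space or tab.
        $isContinuation = $line !== '' && ($line[0] === ' ' || $line[0] === "\t");
        if ($isContinuation && $unfolded !== []) {
            // Drop that single whitespace character and append the
            // remainder to the previous logical line.
            $unfolded[count($unfolded) - 1] .= substr($line, 1);
        } else {
            $unfolded[] = $line;
        }
    }
    return $unfolded;
}

$folded = [
    'DESCRIPTION:Weitere Infos<https://aka.ms/Join',
    ' TeamsMeeting>',
    'SUMMARY:Dies ist die beschreibung',
];
print_r(unfoldLines($folded));
```

Unfolding only joins the pieces of a line; it does not drop any characters apart from the fold marker itself.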

ucola commented 2 years ago

Thank you for your feedback. I can only send you the VEVENT. Is that enough for the investigation?

BEGIN:VEVENT
DESCRIPTION:\n_____________________________________________________________
 ___________________\nMicrosoft Teams-Besprechung\nNehmen Sie von Ihrem Com
 puter oder der mobilen App aus teil\nKlicken Sie hier\, um an der Besprech
 ung teilzunehmen<https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZD
 FjOGQ0NzQtMjRhMy00ZjQ2LWJkNzctMWU1NGQ4MTZmMzX0%40thread.v2/0?context=%7b%2
 2Tid%22%3a%2292a81fc9-b823-4ce5-9973-a0fd656b3x07%22%2c%22Oid%22%3a%22ebb4
 6e02-c3e9-48e3-a71a-c7d319f2481b%22%7d>\nWeitere Infos<https://aka.ms/Join
 TeamsMeeting> | Besprechungsoptionen<https://teams.microsoft.com/meetingOp
 tions/?organizerId=eab46e02-c3e9-48e3-a71a-c7d319f2481b&tenantId=92a81fc9-
 b823-4ce5-9973-a0fd657b3e07&threadId=19_meeting_XDFjOGQ0NzQtMjRhMy00ZjQ2LW
 JkNzctMWU1NGQ4MTZmMzY0@thread.v2&messageId=0&language=de-CH>\n____________
 ____________________________________________________________________\n
UID:040000008200E00074C5B7101A82E008000000000751C39A0620D801000000000000000
 0100000000C5B17A43BC90342B36D0A1693316F75
SUMMARY:Dies ist die beschreibung
DTSTART;TZID=W. Europe Standard Time:20220216T110000
DTEND;TZID=W. Europe Standard Time:20220216T113000
CLASS:PUBLIC
PRIORITY:5
DTSTAMP:20220315T215549Z
TRANSP:OPAQUE
STATUS:CONFIRMED
SEQUENCE:1
LOCATION:Strasse\, 9000 Lausanne
X-MICROSOFT-CDO-APPT-SEQUENCE:1
X-MICROSOFT-CDO-BUSYSTATUS:BUSY
X-MICROSOFT-CDO-INTENDEDSTATUS:BUSY
X-MICROSOFT-CDO-ALLDAYEVENT:FALSE
X-MICROSOFT-CDO-IMPORTANCE:1
X-MICROSOFT-CDO-INSTTYPE:0
X-MICROSOFT-DONOTFORWARDMEETING:FALSE
X-MICROSOFT-DISALLOW-COUNTER:FALSE
END:VEVENT
u01jmg3 commented 2 years ago

I'm a little confused. There's no valid HTML in your description, e.g. <https://aka.ms/JoinTeamsMeeting> is not an "a href link". Either way, the parser isn't altering the description, and the unfold() method is working as expected, returning exactly what it is given.

ucola commented 2 years ago

Hi @u01jmg3, you're right, sorry, my fault. It's not a valid HTML link, and I have just checked it again. The file() function (see below) seems to strip some characters from the URL.

if (($lines = file($filename, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES, $context)) === false) {

The original ICS contains the string below inside the description:

teilzunehmen<https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZD
 FjOGQ0NzQtMjRhMy00ZjQ2LWJkNzctMWU1NGQ4MTZmMzX0%40thread.v2/0?context=%7b%2
 2Tid%22%3a%2292a81fc9-b823-4ce5-9973-a0fd656b3x07%22%2c%22Oid%22%3a%22ebb4
 6e02-c3e9-48e3-a71a-c7d319f2481b%22%7d>\nWeitere 

But $lines contains the string below. The question now is: why did file() remove part of the URL?

teilzunehmen  .com/l/meetup-join/19%3ameeting_ZD
 FjOGQ0NzQtMjRhMy00ZjQ2LWJkNzctMWU1NGQ4MTZmMzX0%40thread.v2/0?context=%7b%2
 2Tid%22%3a%2292a81fc9-b823-4ce5-9973-a0fd656b3x07%22%2c%22Oid%22%3a%22ebb4
 6e02-c3e9-48e3-a71a-c7d319f2481b%22%7d>\nWeitere Infos

After this step, unfold() removes the last piece of the URL. As you can see, the content of this description is a Microsoft Teams invite. It would be nice if we could get the description exactly as it is.

u01jmg3 commented 2 years ago

I'm not quite sure what you expect your chosen browser to render when it's invalid HTML.

If I wrap the output of printData() with htmlspecialchars(), the description is as expected.
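
The likely reason the URL seemed to vanish is that a browser treats `<https://...>` as an unknown tag and does not display it, so escaping the text before output keeps it visible. A small sketch, using a shortened sample string rather than the real parser output:

```php
<?php
// Sample text shaped like the Teams description (shortened for illustration).
$description = 'Weitere Infos<https://aka.ms/JoinTeamsMeeting> | Besprechungsoptionen';

// Echoed as-is into an HTML page, the browser parses "<https://...>" as a tag
// and hides it. Escaping turns the angle brackets into entities instead.
$escaped = htmlspecialchars($description, ENT_QUOTES, 'UTF-8');

echo $escaped, PHP_EOL;
// Weitere Infos&lt;https://aka.ms/JoinTeamsMeeting&gt; | Besprechungsoptionen
```

Viewing the page source or running the script on the command line is another way to confirm that the full URL is still present in the data.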

ucola commented 2 years ago

OK, all right, that's a good way to solve the issue. I will use it and create my own HTML link from the result. Thank you for the support!
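
One way to build such links, as a rough sketch: escape the description first, then turn each escaped `&lt;https://...&gt;` into an anchor. The helper name `linkifyAngleUrls` is hypothetical, and the pattern assumes only http(s) URLs wrapped in angle brackets, as in the Teams invite above.

```php
<?php
/**
 * Hypothetical helper: escape text for HTML, then link URLs that were
 * written as <https://...> in the original description.
 */
function linkifyAngleUrls(string $text): string
{
    // Escape everything first so no raw markup from the calendar reaches the page.
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // After escaping, "<URL>" appears as "&lt;URL&gt;". The URL itself is
    // already escaped, so it can be reused in the href and as the link text.
    return preg_replace(
        '/&lt;(https?:\/\/[^\s<>]+?)&gt;/',
        '<a href="$1">$1</a>',
        $escaped
    );
}

echo linkifyAngleUrls('Weitere Infos<https://aka.ms/JoinTeamsMeeting>'), PHP_EOL;
// Weitere Infos<a href="https://aka.ms/JoinTeamsMeeting">https://aka.ms/JoinTeamsMeeting</a>
```

Escaping before linking means anything else in the description that looks like markup stays inert.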