Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

Missing XSUB functions for UTF-8 char* buffers #17079

Open p5pRT opened 5 years ago

p5pRT commented 5 years ago

Migrated from rt.perl.org#134262 (status was 'open')

Searchable as RT134262$

p5pRT commented 5 years ago

From @pali

Hi! Currently there are XSUB functions/macros which take only a Latin1 buffer, e.g. XST_mPV(), XSRETURN_PV(), POPpbytex, PUSHp(), XPUSHp(), etc.

Would it be possible to also add UTF-8 variants of these functions/macros? E.g. XST_mPVutf8, XSRETURN_PVUTF8, POPputf8x, PUSHputf8, ...

It would simplify working with UTF-8 char* strings, as currently the only way is to use the XSRETURN_SV / POPs / PUSHs macros and construct the SV* from UTF-8 manually via newSVpvn_utf8().

And UTF-8 char* strings are needed to deal with Unicode Perl strings correctly, as Latin1 char* strings can store only the Unicode code points U+0000 .. U+00FF.
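To make the range limit concrete, here is a minimal standalone C sketch (plain C, not Perl API code) that contrasts the two buffer encodings. The helper names `latin1_encode` and `utf8_encode` are hypothetical and exist only for this illustration, and the encoder assumes strict UTF-8, which is narrower than what Perl accepts internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store one code point as a single Latin-1 byte.
 * Returns 1 on success, 0 if the code point is above U+00FF. */
static size_t latin1_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp > 0xFF)
        return 0;               /* not representable in Latin-1 */
    out[0] = (unsigned char)cp;
    return 1;
}

/* Hypothetical helper: encode one code point as strict UTF-8.
 * Writes 1 to 4 bytes into out and returns the byte count, or 0 for
 * surrogates (U+D800..U+DFFF) and values above U+10FFFF. */
static size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;           /* surrogates are not valid scalar values */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                   /* beyond the Unicode range */
}
```

With these helpers, U+00E9 fits in either buffer (one byte as Latin-1, two bytes as UTF-8), while U+20AC can only be stored in the UTF-8 buffer. That is the case where a char*-taking macro needs to know the buffer is UTF-8.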

p5pRT commented 5 years ago

From @jkeenan

Karl, would this be related to https://rt-archive.perl.org/perl5/Ticket/Display.html?id=134142 ?

-- James E Keenan (jkeenan@cpan.org)

p5pRT commented 5 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 5 years ago

From @khwilliamson

On 7/4/19 7:00 AM, James E Keenan via RT wrote:

Karl, would this be related to https://rt-archive.perl.org/perl5/Ticket/Display.html?id=134142 ?

No.

pali, do you have patches?

p5pRT commented 5 years ago

From @pali

On Saturday 06 July 2019 08:57:00 karl williamson via RT wrote:

pali, do you have patches?

No, I have not written anything for this.

p5pRT commented 5 years ago

From @pali

On Saturday 06 July 2019 17:58:11 pali@cpan.org wrote:

No, I have not written anything for this.

So should I prepare some of them?

p5pRT commented 5 years ago

From @khwilliamson

Sure

On Oct 14, 2019, at 1:03 PM, pali@cpan.org wrote:

So should I prepare some of them?

toddr commented 5 years ago

@pali we take pull requests now

khwilliamson commented 2 years ago

@pali Patches welcome