Fznamznon commented 4 months ago

I merged https://github.com/llvm/llvm-project/pull/68620 this morning. However there are several places to improve the implementation. This issue is to track them as well as for further discussion on the matter. To be done:

[ ] Look close into serialization code for EmbedExpr. Right now the storage will be duplicated if two EmbedExpr referencing same data are serialized. Context https://github.com/llvm/llvm-project/pull/68620#discussion_r1625299848
[x] Transform tok::annot_embed into a bunch of tokens and inject them back into stream. We might need a new kind of token that directly hold a numerical value. Context https://github.com/llvm/llvm-project/pull/68620#pullrequestreview-2085206518
[ ] Solve the issue that we're copying the data if a file is page-sized. Ideas https://github.com/llvm/llvm-project/pull/68620#issuecomment-2111257692 .
[ ] Maybe look more into performance of general case. Current performance diff between "fast" and "slow" case available here https://github.com/llvm/llvm-project/pull/68620#issuecomment-2122564803 .

To be discussed

[ ] Decide how to handle -d, -E options
[ ] if _has_include/__has_embed/__has_c_attribute may appear in the limit parameter argument. Right now accepted, we may need to diagnose that.
[ ] What the default preprocessed output format should be, what options to provide to modify it.
[ ] Decide how to spell base64-input.

Please feel free to edit this if I got something wrong.

CC @AaronBallman @jyknight @cor3ntin @jakubjelinek

llvmbot commented 4 months ago

@llvm/issue-subscribers-clang-frontend

Author: Mariya Podchishchaeva (Fznamznon)

I merged https://github.com/llvm/llvm-project/pull/68620 this morning. However there are several places to improve the implementation. This issue is to track them as well as for further discussion on the matter. To be done: - [ ] Look close into serialization code for `EmbedExpr`. Right now the storage will be duplicated if two `EmbedExpr` referencing same data are serialized. Context https://github.com/llvm/llvm-project/pull/68620#discussion_r1625299848 - [ ] Transform `tok::annot_embed` into a bunch of tokens and inject them back into stream. We might need a new kind of token that directly hold a numerical value. Context https://github.com/llvm/llvm-project/pull/68620#pullrequestreview-2085206518 - [ ] Solve the issue that we're copying the data if a file is page-sized. Ideas https://github.com/llvm/llvm-project/pull/68620#issuecomment-2111257692 . - [ ] Maybe look more into performance of general case. Current performance diff between "fast" and "slow" case available here https://github.com/llvm/llvm-project/pull/68620#issuecomment-2122564803 . To be discussed - [ ] Decide how to handle -d<letter>, -E options - [ ] if _has_include/__has_embed/__has_c_attribute may appear in the limit parameter argument. Right now accepted, we may need to diagnose that. - [ ] What the default preprocessed output format should be, what options to provide to modify it. - [ ] Decide how to spell base64-input. Please feel free to edit this if I got something wrong. CC @AaronBallman @jyknight @cor3ntin @jakubjelinek

AaronBallman commented 4 months ago

There are also some more follow-up concerns that were raised on the original PR, starting from https://github.com/llvm/llvm-project/pull/68620#issuecomment-2167525153

Fznamznon commented 4 months ago

I posted a reland https://github.com/llvm/llvm-project/pull/95802 .

R-Goc commented 3 months ago

Is there a plan to implement a feature test macro for this? In the C++ paper, p1967, a feature test macro, __cpp_pp_embed is proposed. Or is there another way to check if embed is present for C++ that I missed, with some has_extension or has_feature macro?

AaronBallman commented 3 months ago

Is there a plan to implement a feature test macro for this? In the C++ paper, p1967, a feature test macro, __cpp_pp_embed is proposed. Or is there another way to check if embed is present for C++ that I missed, with some has_extension or has_feature macro?

In C, the feature test macro is __STDC_VERSION__; the feature has not yet been adopted in C++ and so we're in a bit of a holding pattern for what feature test macro we expose there.

jakubjelinek commented 3 months ago

Why do you need another feature test macro?

ifdef __has_embed

or

ifdef __has_embed

if __has_embed(FILE)

should be enough for both C and C++.

Fznamznon commented 3 months ago

I have created PR https://github.com/llvm/llvm-project/pull/97274 to address the second item in the TODO list (Transform tok::annot_embed into a bunch of tokens and inject them back into stream. We might need a new kind of token that directly hold a numerical value.)

h-vetinari commented 3 months ago

Just noticed that #embed isn't mentioned under in the C23 section of the clang 19 release notes.

AaronBallman commented 3 months ago

Just noticed that #embed isn't mentioned under in the C23 section of the clang 19 release notes.

Good catch, the release note seems to have been dropped.

Fznamznon commented 3 months ago

Just noticed that #embed isn't mentioned under in the C23 section of the clang 19 release notes.

Here it is https://github.com/llvm/llvm-project/pull/97997

Fznamznon commented 2 months ago

Found one more bug via testing against gcc tests https://godbolt.org/z/cffPqhfob Also there is https://github.com/llvm/llvm-project/pull/97274#issuecomment-2228682653 .

Fznamznon commented 2 months ago

embed crash with std::initializer_list https://github.com/llvm/llvm-project/pull/99050#discussion_r1679593276 . The crash itself can be fixed easily, but making clang accept the code seems more complicated.

llvm / llvm-project

[clang] Follow-up on #embed implementation #95222

ifdef __has_embed

ifdef __has_embed

if __has_embed(FILE)