cplusplus / CWG

Core Working Group

[expr.new] p16 should give a clearer requirement #380

Open xmh0511 opened 1 year ago

xmh0511 commented 1 year ago

Full name of submitter (unless configured in github; will be published with the issue): Jim X

[expr.new] p16 says:

For arrays of char, unsigned char, and std::byte, the difference between the result of the new-expression and the address returned by the allocation function shall be an integral multiple of the strictest fundamental alignment requirement of any object type whose size is no greater than the size of the array being created.

So what can the integral value be? Can it be any integer? Obviously, it cannot be an arbitrary value, as implied by [expr.new.note] p9:

[Note 9: Because allocation functions are assumed to return pointers to storage that is appropriately aligned for objects of any type with fundamental alignment, this constraint on array allocation overhead permits the common idiom of allocating character arrays into which objects of other types will later be placed. — end note]

An offset that is an integral multiple of the strictest fundamental alignment should make the result of the new-expression satisfy the alignment requirement imposed by the object type.

Suggested Resolution

For arrays of char, unsigned char, and std::byte, the difference between the result of the new-expression and the address returned by the allocation function shall be an integral multiple of the strictest fundamental alignment requirement of any object type whose size is no greater than the size of the array being created, such that the result of the new-expression satisfies the alignment requirement imposed by that object type.


Incidentally, can the integral multiple of the strictest fundamental alignment really make the result satisfy the requirement? Consider a hypothetical example:

auto p = new unsigned char[10];
new (p) int{0};

Assume the allocation function returns an address with the value 0x1 (this address satisfies the alignment of unsigned char, which is 1) and the alignment imposed by int is 4. No integral multiple of 4, added to 0x1, yields an address that is aligned to 4.

frederick-vs-ja commented 1 year ago

It seems that the integral value is 0 on all known implementations... Do we really want to allow non-zero offsets?

The concerns are related to [basic.stc.dynamic.allocation] p3.2. I think we should harmonize the use of strictest fundamental alignment and new-extended alignment, by specifying that the strictest fundamental alignment is not a new-extended alignment.

xmh0511 commented 1 year ago

The concerns are related to [basic.stc.dynamic.allocation] p3.2.

Ah right, I missed that rule, so I removed the additional part.

jensmaurer commented 1 year ago

I'm not seeing a problem here.

The quoted rule allows implementations to put an array cookie at the start of an array allocation (this is needed for non-trivial types to run destructors when deleting the array), and constrains implementations that the result of the new-expression must still be sufficiently aligned for objects larger than char, given that the allocation function is already so constrained in [basic.stc.dynamic.allocation] p3.2.

An implementation can choose that integral value as it sees fit; obviously, it needs to increase the size of the allocation accordingly.
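The arithmetic behind this can be sketched as follows. This is only an illustrative model, not how any particular compiler lays out arrays: `is_aligned`, `CookieLayout`, and `layout_with_cookie` are hypothetical names, and the addresses and alignments are assumed values. If the allocation function returns an address aligned to the strictest fundamental alignment, adding any integral multiple of that alignment keeps the result aligned, while the request passed to the allocation function grows by the same number of bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if `address` is a multiple of `alignment`.
inline bool is_aligned(std::uintptr_t address, std::size_t alignment) {
    assert(alignment != 0);
    return address % alignment == 0;
}

// Hypothetical model of what a new-expression does with an array cookie.
struct CookieLayout {
    std::size_t request_size;  // size passed to the allocation function
    std::uintptr_t result;     // address yielded by the new-expression
};

// `allocated` is the address returned by the allocation function,
// `multiple` is the integral value chosen by the implementation.
inline CookieLayout layout_with_cookie(std::uintptr_t allocated,
                                       std::size_t array_size,
                                       std::size_t strictest_alignment,
                                       std::size_t multiple) {
    const std::size_t offset = multiple * strictest_alignment;
    CookieLayout layout;
    layout.request_size = array_size + offset;  // allocation grows by the offset
    layout.result = allocated + offset;         // array starts after the cookie
    return layout;
}
```

For an assumed base address of 0x1000 and an alignment of 8, every multiple yields an aligned result. A base address such as 0x1 would never become aligned to 4 this way, which is why the constraint on the allocation function in [basic.stc.dynamic.allocation] p3.2 is needed as well.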

Is your suggested "such that" amendment supposed to be explanatory (which is normatively superfluous)? If not, what additional requirement would it impose that isn't already implied by things said elsewhere?

frederick-vs-ja commented 1 year ago

(this is needed for non-trivial types to run destructors when deleting the array)

This is covered by the previous sentence:

That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array and the allocation function is not a non-allocating form ([new.delete.placement]).

But char, unsigned char, and std::byte are trivially destructible, so it doesn't seem necessary to allow an array cookie for them.

xmh0511 commented 1 year ago

The quoted rule allows implementations to put an array cookie at the start of an array allocation (this is needed for non-trivial types to run destructors when deleting the array)

I think we should impose a similar restriction on the adjustment for new T[N] where T is a type other than char, unsigned char, and std::byte. For example:

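#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <new>

// Replacement array allocation function: prints the address it returns.
void* operator new[](std::size_t N){
    auto p = std::malloc(N);
    if (p == nullptr) throw std::bad_alloc{};
    std::cout<<p<<std::endl;
    return p;
}
// Replacement array deallocation function: prints the address it receives.
void operator delete[](void* ptr) noexcept{
    std::cout<<ptr<<std::endl;
    std::free(ptr);
}
struct A{
    virtual void show(){}
    ~A(){}  // non-trivial destructor, so an array cookie is typically needed
};
int main(){
    A* table = new A[5];
    // May differ from the address printed by operator new[] above.
    std::cout<< "new result "<<table<<std::endl;
    delete [] table;
}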

In the current wording, we do not impose a requirement on the relationship between the address returned by the allocation function and the result of the new-expression.

jensmaurer commented 1 year ago

But char, unsigned char, and std::byte are trivially destructible, so it doesn't seem necessary to allow an array cookie for them.

Right, but the rules of the language still allow it (maybe for uniform treatment). Not a defect, as far as I can see. Feel free to write a paper with some implementation analysis why we don't need that allowance (anymore).

I think we should impose a similar restriction on the adjustment for new T[N] where T is a type other than char, unsigned char, and std::byte.

We already do. The allocation function has to return storage that is suitably aligned; see [basic.stc.dynamic.allocation] p3.2. Since new actually creates an object of the indicated type (i.e. of the array type in your example), that array must be suitably aligned for the given T. It's up to the implementation of the new-expression in the compiler to make this work. The constraint on the allocation function's return value exists so that the compiler actually has a chance to make it work.

The special case for unsigned char (etc.) is for putting objects larger than char into the array (and use the array to provide storage). This case needs extra rules.
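A minimal sketch of that idiom, assuming a hypothetical helper named `place_int_in_char_array`: an unsigned char array at least as large as an int is allocated, and an int is then created inside it with placement new. The alignment check is expected to pass because of [expr.new] p16 combined with [basic.stc.dynamic.allocation] p3.2; the check is there only to make that assumption visible.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical helper: creates an int inside a freshly allocated
// unsigned char array and reports whether everything worked.
inline bool place_int_in_char_array(int value) {
    // The array is at least as large as an int, so the result of the
    // new-expression is expected to be suitably aligned for int.
    unsigned char* storage = new unsigned char[sizeof(int)];
    const bool aligned =
        reinterpret_cast<std::uintptr_t>(storage) % alignof(int) == 0;

    bool round_trip = false;
    if (aligned) {
        // The unsigned char array provides storage for the int object.
        int* object = new (storage) int{value};
        round_trip = (*object == value);
        // int is trivially destructible, so no destructor call is needed.
    }

    delete[] storage;  // releases the array allocated above
    return aligned && round_trip;
}
```

Calling `place_int_in_char_array(42)` should return true on a conforming implementation.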

xmh0511 commented 1 year ago

The constraint on the allocation function's return value exists so that the compiler actually has a chance to make it work.

The address returned by the allocation function does indeed satisfy the alignment requirement of any object type. However, that does not mean the address the implementation chooses, at a positive offset in bytes from the returned address, is suitably aligned, since we impose no requirement on which address the implementation may choose for the case in https://github.com/cplusplus/CWG/issues/380#issuecomment-1655200526.

jensmaurer commented 1 year ago

We already have that requirement.

The allocation function is required to return storage suitably aligned for an "A" (among other objects) due to the rule "if the allocation function is named operator new[], the storage is aligned for any object that does not have new-extended alignment (6.7.6) and is no larger than the requested size"

The new-expression returns a pointer to the created array, whose elements must be suitably aligned (otherwise they can't exist).
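The two addresses discussed in this thread can be observed with a sketch like the one below. `Tracked`, `ArrayNewReport`, and `inspect_array_new` are hypothetical names introduced for illustration; the class-specific operator new[] is used only to record the address returned by the allocation function. The size of the overhead is implementation-specific, so the sketch reports it without assuming a particular value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical type: its allocation function records the address it returns.
struct Tracked {
    static void* last_allocation;
    virtual ~Tracked() {}  // non-trivial destructor
    static void* operator new[](std::size_t size) {
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();
        last_allocation = p;
        return p;
    }
    static void operator delete[](void* p) noexcept { std::free(p); }
};
void* Tracked::last_allocation = nullptr;

struct ArrayNewReport {
    bool result_aligned;      // is the array suitably aligned for Tracked?
    std::ptrdiff_t overhead;  // bytes between the allocation and the array
};

inline ArrayNewReport inspect_array_new() {
    Tracked* table = new Tracked[5];
    const auto result = reinterpret_cast<std::uintptr_t>(table);
    const auto allocated =
        reinterpret_cast<std::uintptr_t>(Tracked::last_allocation);

    ArrayNewReport report;
    report.result_aligned = (result % alignof(Tracked) == 0);
    report.overhead = static_cast<std::ptrdiff_t>(result - allocated);

    delete[] table;
    return report;
}
```

On a conforming implementation `result_aligned` should be true whatever overhead the compiler chooses, since the array elements could not otherwise exist at that address.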