ark-lang / ark

A compiled systems programming language written in Go using the LLVM framework
https://ark-lang.github.io/
MIT License
677 stars 47 forks

String library #675

Open kiljacken opened 8 years ago

kiljacken commented 8 years ago

From @felixangell on January 18, 2016 15:30

String concatenation, destroying strings, substring, etc... also iterating through strings, length, blah blah blah

Copied from original issue: ark-lang/stdlib#3

felixangell commented 8 years ago

What happened to the stdlib repo?

kiljacken commented 8 years ago

Lol, if you were hanging around IRC, you would've known that me and @MovingtoMars agreed that it would be less of a hassle to keep the stdlib in the main repo, so as not to have to deal with subtrees/-modules.

felixangell commented 8 years ago

@kiljacken Yeah I'm cool with that, been a little bit busy with college currently so I'm not as active :egg:

andrewrk commented 8 years ago

What is the data layout of the string type that Ark users should use?

Ark's standard library makes a call to fopen from libc, which takes a char *. This requires a pointer to a null-terminated byte string. Is this compatible with Ark's string type? If not, what is the plan for conversion?

kiljacken commented 8 years ago

Ark's standard string is a non-null-terminated []u8, which is just a length and data, so we have to do a conversion and add a null terminator. I believe that @0xbadb002 missed this when he implemented the file stuff.
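For illustration, the layout described above can be sketched in C (C rather than Ark, since the concern here is libc interop). The struct name `ark_string`, its field names, and the helper `ark_slice` are assumptions made for this sketch, not Ark's actual definitions. It shows why a length-plus-data string cannot be handed to a function expecting a `char *`: a slice can point into the middle of a larger buffer, so the byte after its last element is whatever happens to follow it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed C view of an Ark []u8: a data pointer plus a length, no terminator. */
typedef struct {
    const uint8_t *data;
    size_t len;
} ark_string;

/* Hypothetical helper: returns a view of bytes [start, end) of s.
   The view shares the backing memory of s; nothing is copied and
   no terminator is written. */
ark_string ark_slice(ark_string s, size_t start, size_t end) {
    assert(start <= end && end <= s.len);
    ark_string out = { s.data + start, end - start };
    return out;
}
```

Calling strlen or fopen on the `data` pointer of such a view would read past `len`, which is why a copy with an appended terminator is needed.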

andrewrk commented 8 years ago

where do you allocate the memory to do the conversion?

kiljacken commented 8 years ago

We don't, currently, but I'd like functions for automatically allocating on the heap and doing the conversion, and a function for doing the conversion in user-passed memory.

felixangell commented 8 years ago

@kiljacken Woops :new_moon_with_face:

kiljacken commented 8 years ago

In relation to my earlier comment on conversion between Ark and C strings, here are the imagined signatures of the functions we'd want:

// Converts an Ark string to a zero-terminated string, in user-passed memory
func ToZeroTerminated(src: string, dst: ^u8);

// Wrapper around ToZeroTerminated that allocates memory using mem
func AllocZeroTerminated(src: string) -> ^u8;

// Creates an Ark string from a zero-terminated string, using the same backing memory
func WrapZeroTerminated(src: ^u8) -> string;

// Creates an Ark string from a zero-terminated string, copying to user-passed memory
func FromZeroTerminated(src: ^u8, dst: ^u8) -> string;

// Wrapper around FromZeroTerminated that allocates memory via mem
func AllocFromZeroTerminated(src: ^u8) -> string;

I imagine that these would possibly be (static) methods on string, which would depend on #716. Naming is very much open to debate.
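As a rough sketch of what these five functions might do, here is one possible C rendering. Everything in it is an assumption made for illustration: the `ark_string` struct stands in for Ark's string, the snake_case names mirror the proposed signatures, and malloc stands in for Ark's mem module. Buffer-size requirements, which the signatures above leave implicit, are spelled out in the comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed C view of an Ark string: data pointer plus length. */
typedef struct {
    uint8_t *data;
    size_t len;
} ark_string;

/* ToZeroTerminated: dst must have room for src.len + 1 bytes. */
void to_zero_terminated(ark_string src, uint8_t *dst) {
    assert(dst != NULL);
    if (src.len > 0) memcpy(dst, src.data, src.len);
    dst[src.len] = 0;
}

/* AllocZeroTerminated: heap-allocating wrapper (malloc stands in for mem).
   Returns NULL on allocation failure; the caller frees the result. */
uint8_t *alloc_zero_terminated(ark_string src) {
    if (src.len == SIZE_MAX) return NULL; /* src.len + 1 would overflow */
    uint8_t *dst = malloc(src.len + 1);
    if (dst != NULL) to_zero_terminated(src, dst);
    return dst;
}

/* WrapZeroTerminated: no copy; the result shares src's memory. */
ark_string wrap_zero_terminated(uint8_t *src) {
    assert(src != NULL);
    ark_string s = { src, strlen((const char *)src) };
    return s;
}

/* FromZeroTerminated: copies the bytes (without the terminator) into dst,
   which must have room for strlen(src) bytes. */
ark_string from_zero_terminated(const uint8_t *src, uint8_t *dst) {
    assert(src != NULL && dst != NULL);
    size_t len = strlen((const char *)src);
    if (len > 0) memcpy(dst, src, len);
    ark_string s = { dst, len };
    return s;
}

/* AllocFromZeroTerminated: heap-allocating wrapper.
   On allocation failure the result has data == NULL and len == 0. */
ark_string alloc_from_zero_terminated(const uint8_t *src) {
    assert(src != NULL);
    size_t len = strlen((const char *)src);
    ark_string s = { NULL, 0 };
    uint8_t *dst = malloc(len > 0 ? len : 1);
    if (dst != NULL) s = from_zero_terminated(src, dst);
    return s;
}
```

In this sketch the two Alloc variants transfer ownership to the caller, while the Wrap variant leaves ownership with whoever owns the C string.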

felixangell commented 8 years ago

LGTM

On a (semi-related) note, if we were to approve #716, would it be weird to have an alias for a byte pointer called CString? We could implement these functions on top of CString and the string type. I think this might be one of the cases where #716 would make things a lot cleaner.

kiljacken commented 8 years ago

That would certainly be nice, from both usability and readability standpoints.

danharbor95 commented 8 years ago

Would there be a way to separate the heap and stack allocation of the strings? You sometimes pass in the data, and you have another one that does the allocation for you; it's a lot like C.

kiljacken commented 8 years ago

@danharbor95 The idea would be that we'd have some way to allocate raw memory on the stack, and then you can use the versions where you pass in a destination pointer.

danharbor95 commented 8 years ago

so like mem alloc but for the stack?

kiljacken commented 8 years ago

@danharbor95 Yes, that's the idea.
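The stack-plus-destination-pointer pattern discussed here might look roughly like the following in C, where a fixed-size local array plays the role of stack-allocated raw memory. The struct, the buffer size `PATH_BUF_SIZE`, and the function names are all hypothetical, and strlen is used only as a stand-in for a libc call such as fopen that expects a terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed C view of an Ark string: data pointer plus length. */
typedef struct {
    const uint8_t *data;
    size_t len;
} ark_string;

enum { PATH_BUF_SIZE = 256 }; /* arbitrary size chosen for this sketch */

/* Destination-pointer variant: dst must have room for src.len + 1 bytes. */
void to_zero_terminated(ark_string src, char *dst) {
    assert(dst != NULL);
    if (src.len > 0) memcpy(dst, src.data, src.len);
    dst[src.len] = '\0';
}

/* Converts path into a stack buffer and hands it to a C-string consumer.
   Returns the length the consumer saw, or -1 if path does not fit. */
long cstr_len_via_stack(ark_string path) {
    char buf[PATH_BUF_SIZE];               /* lives on the stack, no heap use */
    if (path.len >= sizeof buf) return -1; /* no room for the terminator */
    to_zero_terminated(path, buf);
    return (long)strlen(buf);              /* stand-in for e.g. fopen(buf, "r") */
}
```

The buffer disappears when the function returns, so this only suits calls that do not keep the pointer afterwards.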

MovingtoMars commented 8 years ago

We don't need both WrapZeroTerminated and FromZeroTerminated. Just using memcpy then WrapZeroTerminated is equivalent to FromZeroTerminated.

kiljacken commented 8 years ago

@MovingtoMars Well, the idea here is ease of use. We might as well provide a function (FromZeroTerminated) that does the strlen and memcpy for the user.
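To make the two routes concrete, here is a small C sketch comparing them. The struct and function names are assumptions for illustration only. One detail that shows up in this rendering: for the manual route to work, the copy has to include the terminator (so dst needs one extra byte), because the wrap step runs strlen on the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed C view of an Ark string: data pointer plus length. */
typedef struct {
    uint8_t *data;
    size_t len;
} ark_string;

/* Sketch of WrapZeroTerminated: shares src's memory, measures it with strlen. */
ark_string wrap_zero_terminated(uint8_t *src) {
    assert(src != NULL);
    ark_string s = { src, strlen((const char *)src) };
    return s;
}

/* Sketch of FromZeroTerminated: one strlen, one memcpy, no terminator copied.
   dst needs room for strlen(src) bytes. */
ark_string from_zero_terminated(const uint8_t *src, uint8_t *dst) {
    assert(src != NULL && dst != NULL);
    size_t len = strlen((const char *)src);
    if (len > 0) memcpy(dst, src, len);
    ark_string s = { dst, len };
    return s;
}

/* The manual route: memcpy, then wrap. dst needs room for strlen(src) + 1
   bytes, since the terminator must be copied for the wrap step to find it. */
ark_string copy_then_wrap(const uint8_t *src, uint8_t *dst) {
    assert(src != NULL && dst != NULL);
    size_t len = strlen((const char *)src);
    memcpy(dst, src, len + 1);
    return wrap_zero_terminated(dst);
}
```

Both routes yield a string with the same length and contents; they differ in how much destination memory they need and in how many times the source is measured.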

MovingtoMars commented 8 years ago

I think it's complicating the API for too little gain.

kiljacken commented 8 years ago

@MovingtoMars Well, we'll just have to agree to disagree on this then. I do not think it complicates the API by any amount worth worrying about, and thus I believe that the usability gain far outweighs any concern about complexity in this case.

MovingtoMars commented 8 years ago

I'd be happier to include it if the names were better. Those names aren't distinct enough; it isn't clear what the differences are.

kiljacken commented 8 years ago

@MovingtoMars I already acknowledged that the names were far from final in the original suggestion. They were simply there to get my ideas across.

> Naming is very much open to debate.

You're free to help come up with better names.

felixangell commented 8 years ago

Gonna have to say I agree with @kiljacken here on this one, the naming isn't pretty but I'm sure things will be cleaner. Especially if we approve #716 :wink: