pieroxy / lz-string

LZ-based compression algorithm for JavaScript
MIT License
4.12k stars 568 forks source link

1-to-1 Conversion of LZ-String to Typescript #174

Closed karnthis closed 1 year ago

karnthis commented 1 year ago

Work in progress, creating draft for discussion and feedback during work.

Completed:

Todo:

karnthis commented 1 year ago

Lots of excellent points. I totally goofed on the .gitignore. All my templates include it by default, and I forgot that Vite doesn't include one out of the box. I'll get that sorted.

While normally I would agree with you 100% that that excessive explicit typing is not needed (I don't do it in my working PLZSU project), with the abundant variable reuse and in several cases cross-contamination of types I would feel much more comfortable leaving the typing as-is until this PR is complete, and then stripping them back out with the upgrade. What are your thoughts on this approach?

In the case of result.push(dictionary[3] = w = String(c)) I agree there are ways to improve the source. I think if we are doing a 1:1 we should do it consistently though, which is actually why I think we should stick with x as string for this pass. it leaves no question of changed behavior, and also allows a clear refactor pass later.

I am curious as to your thoughts on interfaces vs straight types. Is there a particular advantage I don't know about or is it personal preference? I do it out of preference, so am open to changing but would love to know the tradeoffs.

I did want to add a contributors section to the project, but was honestly a little uncomfortable being the first one to add myself in such a way.

Rycochet commented 1 year ago

While normally I would agree with you 100% that that excessive explicit typing is not needed (I don't do it in my working PLZSU project), with the abundant variable reuse and in several cases cross-contamination of types I would feel much more comfortable leaving the typing as-is until this PR is complete, and then stripping them back out with the upgrade. What are your thoughts on this approach?

For the explicit typings when implicit works - at some point we'll get eslint on there, and by default that will give a warning that it's not needed - TS itself immediately warns when you don't use something in a way that's valid, so it adds text without adding value ;-)

Definitely nothing to block on!

In the case of result.push(dictionary[3] = w = String(c)) I agree there are ways to improve the source. I think if we are doing a 1:1 we should do it consistently though, which is actually why I think we should stick with x as string for this pass. it leaves no question of changed behavior, and also allows a clear refactor pass later.

My problem with leaving it exactly as it is, is that it's potentially letting something through that could be a bug. We know that should always be a string, but because at one point it might not be there's that ambiguity... I really don't like the thought of setting it as an EntityId type (string | number) - hence the explicit String(val) to force it to be the expected type... @pieroxy are you about to use your expertise on whether that could ever conceivably be anything other than a string? (search for the as string text for the few occasions)

I am curious as to your thoughts on interfaces vs straight types. Is there a particular advantage I don't know about or is it personal preference? I do it out of preference, so am open to changing but would love to know the tradeoffs.

Normally I'd prefer Interfaces, but Types are transparently hidden with code hinting etc - and these aren't exactly interfaces as they're literally just key:value entries, so this makes it a bit more explicit ;-)

I did want to add a contributors section to the project, but was honestly a little uncomfortable being the first one to add myself in such a way.

You'll forever be in the git blame now :-P

karnthis commented 1 year ago

We know that should always be a string, but because at one point it might not be there's that ambiguity... I really don't like the thought of setting it as an EntityId type (string | number) - hence the explicit String(val) to force it to be the expected type... @pieroxy are you about to use your expertise on whether that could ever conceivably be anything other than a string? (search for the as string text for the few occasions)

My problem is not so much that it could be different, but that is is different, for example const dictionary: (number | string)[] = [] in _decompress populates the first three index points with numbers, and the rest is entirely strings. Additionally main.ts line 423 switch (c = bits) is directly comparing what should be a string to a number and later is set to a number on 439 for example.

Rycochet commented 1 year ago

My problem is not so much that it could be different, but that is is different, for example const dictionary: (number | string)[] = [] in _decompress populates the first three index points with numbers, and the rest is entirely strings. Additionally main.ts line 423 switch (c = bits) is directly comparing what should be a string to a number and later is set to a number on 439 for example.

The first one is easy then - const dictionary: [number, number, number, ...string] = []; (since they added rest into tuples), the second probably needs the cast (for now at least) ;-)

karnthis commented 1 year ago

Ok, I completely forgot you can do that lol.

I have linting added, but I am leaving it out from the build process for now since I really want this move to be as explicit as possible for anyone that reads back later trying to understand the transition. Do we want to wait on @pieroxy to weigh in on the casting, or move forward on that? I'm not sure how active they are.

Rycochet commented 1 year ago

I'd like to see @pieroxy give a thumbs up before merging - though maybe put a time limit of 2 weeks or so (basically I'm really busy until then - Friday June 9th to be arbitrary) - as soon as it's updated getting the linting and code-formatting sorted will be awesome!

karnthis commented 1 year ago

const dictionary: [number, number, number, ...string] = []

On further review, if we want to stick to 1:1 we will need to continue using:

const dictionary: (number | string)[] = []

for minimal change, or cast the first 3 to strings like so:

const dictionary: string[] = []
...
for (i = 0; i < 3; i += 1) {
      dictionary[i] = String(i);
    }

Using const dictionary: [number, number, number, ...string[]] = [] would require redoing the existing code as you can''t declare specific elements on an empty array. It would yield something like this:

const dictionary: [number, number, number, ...string[]] = [0, 1, 2]

The above loop would be completely removed instead of modified.

Rycochet commented 1 year ago

Could try it as [] | [number] | [number, number] | [number, number, number] | [number, number, number, ...string] though I'm not sure how TS would react to that? :-P

karnthis commented 1 year ago

that will yield some really wonky type behavior. If we want to go that way our best option is probably number[] | (number | string)[] as it will always be one of the 2, but I don't see much of an advantage from that.

Rycochet commented 1 year ago

that will yield some really wonky type behavior. If we want to go that way our best option is probably number[] | (number | string)[] as it will always be one of the 2, but I don't see much of an advantage from that.

Lets keep it simple via the casting and deal with it later etc :-)

karnthis commented 1 year ago

I did the casting as it seems like our cleanest option, but it will be super easy to remove if decided against.

karnthis commented 1 year ago

Are we concerned at all with benchmarks at this time? Since it is just typing at this point I would be shocked if it yielded a meaningful difference.

Rycochet commented 1 year ago

Are we concerned at all with benchmarks at this time? Since it is just typing at this point I would be shocked if it yielded a meaningful difference.

Honestly I'll want to take a look at the final compiled output before allowing it to be merged - ideally we get a 1:1 library file, then can look at future improvements based on other PRs etc :-)

karnthis commented 1 year ago

Honestly I'll want to take a look at the final compiled output before allowing it to be merged - ideally we get a 1:1 library file, then can look at future improvements based on other PRs etc :-)

Do we want to commit the compiled code folder then? I was expecting we wouldn't and would set up a GitHub action to do the compile and npm publish. In my experience it is very uncommon to commit the dist folder.

Rycochet commented 1 year ago

Honestly I'll want to take a look at the final compiled output before allowing it to be merged - ideally we get a 1:1 library file, then can look at future improvements based on other PRs etc :-)

Do we want to commit the compiled code folder then? I was expecting we wouldn't and would set up a GitHub action to do the compile and npm publish. In my experience it is very uncommon to commit the dist folder.

Given the locked package versions it should be pretty deterministic now - so I can compare locally when building :-)

karnthis commented 1 year ago

Testing port is done, bundling is next.

did base64-string ever work? it has no tests and best I can tell doesn't actually function as-is.

karnthis commented 1 year ago

bundling is done, tests added as sanity check for post-bundle code. I have done everything on my checklist, flagging as ready

Rycochet commented 1 year ago

Just a ping to say I've not forgotten this - currently fighting against the uk heatwave (where AC isn't common, and I come from a far colder climate) - so hopefully get this checked out and merged later in the week! :-)

Rycochet commented 1 year ago

So I've built it locally, run all the tests (passing, with around 95% coverage), and then compared the final output to the old final output (the old lz-string.min.js to the new index.es.js) where they are identical in terms of code (variable names change), but it doesn't have the UMD wrapper - which is included inside index.umd.js, with the correct package.json code to point bundlers at the right file.

So given that we've got a new RC release for it, and everything else being equal - enjoy!!!