Open gonzalo-bulnes opened 6 years ago
Things that I think you could try is this space are:
Hi @sebglazebrook!
Thank for the pointers! I think the constants are what I was trying to achieve, and thought that my variable being immutable, then they would be constant... I'll search specifically for constants : )
Yep, there are constants in Rust, I forgot that paragraph it seems, but that makes sense to me. (:bulb: thanks @sebglazebrook)
While loading the data from file would allow to swap easily the wordlists, I don't think there is any need for that currently. (I personally like the long worldlist from the EFF.)
Using a struct could be a next step. For now I think constants are going to fulfill the purpose of making the code more readable.
I'll reword this issue now that the way to go is clear. :slightly_smiling_face:
The data of EFF's long wordlist should live in a different file
As it is currently, that long piece of data (7776 numbers -dice throws- and 7776 words) is bloating the dice/wordlist/mod.rs
file.
Github does a pretty good job, only skipping syntactic coloration for those lines, but otherwise editing the file isn't pleasant. Also, the data is unlikely to change while the logic certainly will.
Objective: move data to a different file, in the form of constants so that it doesn't get mixed with the code that is likely to be improved (and change) over time.
Challenge: as explained in the link above, constants require a type annotation... what is their type?
Bonus points: I think that probably the only constant that needs to be public is the eff_long_wordlist
itself, keeping the other ones private would be nice.
This is a refactoring, so there is no need to add new tests, the tests that are already written are sufficient. The goal is to make them green once the move is over :heavy_check_mark:
Of course, I'm happy to discuss alternatives or help at any stage :slightly_smiling_face: I am learning Rust as well!
@gonzalo-bulnes Personally I think that loading the words from a file is the best approach here.
Just my two cents 😄
I think that your considerations raise good questions @Bassetts. Before going into the details, my current thinking is that without discarding it in the future, loading the wordlist from a file is not what I would start with. I'll do my best to lay out what I have in mind :slightly_smiling_face:
It is probably useful to start by stating that by principle I try to do as little as is necessary at once, keeping in mind that in software we can always do more in a week if we think it's useful, and that an iterative approach, in my experience, often leads to find other useful things to do, with a different perspective, once the problem at hand is solved.
Also, at this moment, I am as interested in training myself to building the right thing as in learning Rust (i.e. what building the thing right means in Rust). :slightly_smiling_face: I don't mind doing in 3 small PRs what could be done in one if there was time pressure.
So, this is what I have currently in mind:
I am no expert but I believe a text file has more potential for compression than a binary
I'm no expert either, that statement seems intuitively correct to me... I would also check (gut felling) what happens during the compilation steps. I feel that this question can be an interesting pretext to learn things about Rust internals if that's something we / someone wants to do.
With all that in mind, I'd like to reword / refine the objectives for this step from:
Github does a pretty good job, only skipping syntactic coloration for those lines, but otherwise editing the file isn't pleasant. Also, the data is unlikely to change while the logic certainly will.
Objective: move data to a different file, in the form of constants so that it doesn't get mixed with the code that is likely to be improved (and change) over time.
...to:
Github does a pretty good job, only skipping syntactic coloration for those lines, but otherwise editing the file isn't pleasant. Also, the data is unlikely to change while the logic certainly will.
Objective: move data to a different file, in the form of constants so that it doesn't get mixed with the code that is likely to be improved (and change) over time, so that:
- the
dice/wordlist/mod.rs
is easier to work with (more colors, less scrolling)...- ...without becoming more complex to review (one step at a time)
- it is clearer that the EFF long wordlist is an external dependency and shouldn't be modified
[1] The trust / security dicussion is an interesting one in itself I believe, and I'm tempted to open a separate issue on that specific topic not to sidetrack this one too much. (Done, see #6 ) [2] Not excluding that there could be other use cases than the ones that I can think of, see (1)
@Bassetts Althought I would be reluctant to use it here for the reasons above, you might be interested in rust-embed for the purpose of working with separate source files while still shipping a single binary. And it is a macro, which could also be interesting. : )
@gonzalo-bulnes I was looking into the options for loading a file into the binary at compile time the other day and there is the include_str!
macro.
I don't think it is particularly suitable for this application as you would then just have one huge static string embedded into the binary.
I also came across support for code generation by using a build script. That option is very interesting to me as you could have a build step that takes the word list file and generates code for you e.g. making each line into a variable, or putting all the lines into a HashMap
.
Doing code generation would mean your word list text file can just be committed to source control, which would give you all the benefits that come with that such as;
diff
between the file in your repo and the one downloaded directly from the "upstream" (in this case EFF)All that would then need to be done in terms of an audit is to verify the code generation code doesn't modify the text when generating the code, which I think should be fairly simple.
This also has the benefit that if someone wanted to provide a binary with an alternate word list then they could fork your repo, replace the word list with their own file (as long as it is in the expected format), and then just build. The code generation will take care of the rest.
For the reasons you laid out above this probably isn't the approach to take right now, but I thought it was worth highlighting as I found it very interesting, and we are both learning Rust 🦀 here. I may fire up a branch at some point to play with the code generation as it seems very interesting.
Code generation using a build script is very interesting indeed! Thanks for the pointer @Bassetts!
If you skipped the context above, here is the description of the issue :wink:
The data of EFF's long wordlist should live in a different file
As it is currently, that long piece of data (7776 numbers -dice throws- and 7776 words) is bloating the dice/wordlist/mod.rs
file.
Github does a pretty good job, only skipping syntactic coloration for those lines, but otherwise editing the file isn't pleasant. Also, the data is unlikely to change while the logic certainly will.
Objective: move data to a different file, in the form of constants so that it doesn't get mixed with the code that is likely to be improved (and change) over time, so that:
dice/wordlist/mod.rs
is easier to work with (more colors, less scrolling)...Challenge: as explained in the link above, constants require a type annotation... what is their type?
Bonus points: I think that probably the only constant that needs to be public is the eff_long_wordlist
itself, keeping the other ones private would be nice.
This is a refactoring, so there is no need to add new tests, the tests that are already written are sufficient. The goal is to make them green once the move is over :heavy_check_mark:
Of course, I'm happy to help at any stage :slightly_smiling_face: I am learning Rust as well!
I would like to pick this issue up please :)
Edit: If you are looking to make your first (or second, or third!) open-source contribution, welcome! You can read the original comments in chronological order for context, or jump to the issue re-definition below! :slightly_smiling_face:
As it is currently, that long piece of data (7776 numbers -dice throws- and 7776 words) is bloating the
dice/wordlist/mod.rs
file.Github does a pretty good job, only skipping syntactic coloration for those lines, but otherwise editing the file isn't pleasant. Also, the data is unlikely to change while the logic certainly will.
I made an attempt to move the
throws
andwords
definitions to a different file, but Rust didn't like the variable assignments outside a function. I then tried encapsulating them in two functions, but got a bit lots with the ownership details as that implied extra borrowing...I certainly need to understand better the ownership model, but I'm also sure there must be a pattern for this use case. Do you know of one? Please let me know! :slightly_smiling_face: