Closed antoniolinhart closed 6 months ago
@mgeisler What do you think about the implementation of splitting the template file? In this version I chose to add messages that did not have a matching source to a catalog that was saved in the file name specified in the input params.
I'm open to changing how we use that pot-file input param, I thought that this made the most sense for now. Would you want to add another optional field for output directory if we have a depth > 0?
- Mimics the directory structure for the book
I think this is a key decision. I was thinking we would use the Book
(from RenderContext
) as the driver of this — not the file names.
We currently iterate over the BookItem
s in depth-first order (via Book::iter
), but we could instead iterate over the sections
field. I think that would let you avoid having to think of where SUMMARY.md
fits into the picture.
As for what to name the POT files, I guess this can be derived from Chapter::path
? Perhaps we need a top-level output directory (po/
) and then we create po/foo.pot
for a Chapter { path:
foo.md}
chapter. We would put all sub_items
into the same po/foo.pot
file if we're splitting at depth = 1
. If we use depth = 2
, then we would create po/foo.pot
with only the data from this chapter and create po/foo/bar.pot
with everything from the sub_items
.
There is no guarantee that nested chapters are stored in nested files, but I think we should be safe if we follow the logical structure of the Book
in this manner.
- Mimics the directory structure for the book
I think this is a key decision. I was thinking we would use the
Book
(fromRenderContext
) as the driver of this — not the file names.We currently iterate over the
BookItem
s in depth-first order (viaBook::iter
), but we could instead iterate over thesections
field. I think that would let you avoid having to think of whereSUMMARY.md
fits into the picture.
Oh! That's a good point for using the sections directly instead of that built in iterator. Until looking into the Book items struct it was hard to know exactly why that worked the way it did.
If I follow through all of the Chapters into their sub items and take the content from each Chapter and so on, that should include all the messages, right? I'm running into an interesting situation where there's some messages that aren't there that should be. Comparing between a current messages.pot and a messages.pot from my program with depth 0, it's missing about 10 messages that all appear to be the Rust scripts that get included as sample code. There are many scripts included in the messages that are sample code, there just happen to be some that don't come through?
As for what to name the POT files, I guess this can be derived from
Chapter::path
? Perhaps we need a top-level output directory (po/
) and then we createpo/foo.pot
for aChapter { path:
foo.md}
chapter. We would put allsub_items
into the samepo/foo.pot
file if we're splitting atdepth = 1
. If we usedepth = 2
, then we would createpo/foo.pot
with only the data from this chapter and createpo/foo/bar.pot
with everything from thesub_items
.There is no guarantee that nested chapters are stored in nested files, but I think we should be safe if we follow the logical structure of the
Book
in this manner.
This makes more sense to follow the book instead. I think it will lead to the split depth input being a lot more logical too. Thank you!
Still working through the revised implementation!
There are many scripts included in the messages that are sample code, there just happen to be some that don't come through?
Much of the Rust code is included via the {{#include foo.rs}}
system. That syntax is a mdbook extension. This is done by the built-in links
preprocessor.
Could the scripts be related to this?
Oh... another idea: are you perhaps seeing the effects of #75 by @dyoo? It was merged just a few days ago :slightly_smiling_face:
There is no guarantee that nested chapters are stored in nested files, but I think we should be safe if we follow the logical structure of the
Book
in this manner.This makes more sense to follow the book instead. I think it will lead to the split depth input being a lot more logical too. Thank you!
Still working through the revised implementation!
Sounds great, thank you so much for your help here!
Hi @antoniolinhart, just a small thing: the format check fails. You should configure your editor to call rustfmt
automatically (or run dprint fmt
on the code).
There are many scripts included in the messages that are sample code, there just happen to be some that don't come through?
Much of the Rust code is included via the
{{#include foo.rs}}
system. That syntax is a [mdbook extension]https://rust-lang.github.io/mdBook/format/mdbook.html#including-files). This is done by the built-inlinks
preprocessor.Could the scripts be related to this?
Oh... another idea: are you perhaps seeing the effects of #75 by @dyoo? It was merged just a few days ago 🙂
Looks like I need to pull main
more often :laughing: This seems to be what my issue was. Thank you!
Hey @antoniolinhart, I skimmed through the updated code — from the test cases it looks like it works? Super cool!
You should rebase onto the latest main
and resolve the merge conflict.
Hi @antoniolinhart, thanks for the cleanup. I tried checking out the branch and ran
% MDBOOK_OUTPUT='{"xgettext": {"pot-file": "messages.pot", "depth": 1}}' mdbook build -d po1
2023-09-29 10:38:28 [INFO] (mdbook::book): Book building has started
2023-09-29 10:38:28 [INFO] (mdbook::book): Running the xgettext backend
2023-09-29 10:38:28 [INFO] (mdbook::renderer): Invoking the "xgettext" renderer
This gives me
% tree po1 (~/src/comprehensive-rust)
po1
└── src
├── android
│ ├── aidl.pot
│ ├── build-rules.pot
│ ├── interoperability.pot
│ ├── logging.pot
│ └── setup.pot
├── android.pot
I had hoped there would be a single POT file for each chapter in the top-level. Like this:
% tree expected1 (~/src/comprehensive-rust)
expected1
├── android.pot
├── async.pot
├── bare-metal.pot
├── basic-syntax.pot
├── cargo.pot
I suppose the problem in an off-by-one error somewhere :slightly_smiling_face:
Hi @antoniolinhart, thanks for the cleanup. I tried checking out the branch and ran
% MDBOOK_OUTPUT='{"xgettext": {"pot-file": "messages.pot", "depth": 1}}' mdbook build -d po1 2023-09-29 10:38:28 [INFO] (mdbook::book): Book building has started 2023-09-29 10:38:28 [INFO] (mdbook::book): Running the xgettext backend 2023-09-29 10:38:28 [INFO] (mdbook::renderer): Invoking the "xgettext" renderer
This gives me
% tree po1 (~/src/comprehensive-rust) po1 └── src ├── android │ ├── aidl.pot │ ├── build-rules.pot │ ├── interoperability.pot │ ├── logging.pot │ └── setup.pot ├── android.pot
I had hoped there would be a single POT file for each chapter in the top-level. Like this:
% tree expected1 (~/src/comprehensive-rust) expected1 ├── android.pot ├── async.pot ├── bare-metal.pot ├── basic-syntax.pot ├── cargo.pot
I suppose the problem in an off-by-one error somewhere 🙂
The challenge with this is that Android isn't actually a top-level chapter, mdbook treats it as a part title and doesn't contain the sub-items as expected like "interoperability" "aidl", "build-rules". I think if we wanted something like this we'd have to leverage the listed directory in the code, or change the SUMMARY file to make it easier
The challenge with this is that Android isn't actually a top-level chapter, mdbook treats it as a part title and doesn't contain the sub-items as expected like "interoperability" "aidl", "build-rules". I think if we wanted something like this we'd have to leverage the listed directory in the code, or change the SUMMARY file to make it easier
Oh! I had not realized this until now...
We can change our own SUMMARY.md
file, but we won't be able to the change the upstream format easily.
I think the code right now splits the POT file into ~63 files, whereas we only have ~13 parts. Do you think you can add code for tracking the parts? I would suggest transforming the part title into a slug using something like this:
fn slug(title: &str) -> String {
title
.split_whitespace()
.map(|word| {
word.to_lowercase()
.chars()
.filter(|&ch| ch == '-' || !ch.is_ascii_punctuation())
.collect::<String>()
})
.filter(|word| !word.is_empty())
.collect::<Vec<_>>()
.join("-")
}
I tested it a bit on our own titles plus some headings from the Rust Book and they look reasonable.
Hi @antoniolinhart, thanks for the cleanup. I tried checking out the branch and ran
% MDBOOK_OUTPUT='{"xgettext": {"pot-file": "messages.pot", "depth": 1}}' mdbook build -d po1 2023-09-29 10:38:28 [INFO] (mdbook::book): Book building has started 2023-09-29 10:38:28 [INFO] (mdbook::book): Running the xgettext backend 2023-09-29 10:38:28 [INFO] (mdbook::renderer): Invoking the "xgettext" renderer
This gives me
% tree po1 (~/src/comprehensive-rust) po1 └── src ├── android │ ├── aidl.pot │ ├── build-rules.pot │ ├── interoperability.pot │ ├── logging.pot │ └── setup.pot ├── android.pot
I had hoped there would be a single POT file for each chapter in the top-level. Like this:
% tree expected1 (~/src/comprehensive-rust) expected1 ├── android.pot ├── async.pot ├── bare-metal.pot ├── basic-syntax.pot ├── cargo.pot
I suppose the problem in an off-by-one error somewhere 🙂
After the most recent improvements I think it's closer to expectations - depth 1 splits each 'part' into a separate catalog.
Here's the output I get for the following depths (using the Android chapter as an example after depth = 1):
$ MDBOOK_OUTPUT='{"xgettext": {"depth": 1}}' mdbook build -d po/XX
2023-10-05 17:36:46 [INFO] (mdbook::book): Book building has started
2023-10-05 17:36:46 [INFO] (mdbook::book): Running the xgettext backend
2023-10-05 17:36:46 [INFO] (mdbook::renderer): Invoking the "xgettext" renderer
$ tree po/XX
po/XX
├── android.pot
├── bare-metal-afternoon.pot
├── bare-metal-morning.pot
├── concurrency-afternoon.pot
├── concurrency-morning.pot
├── day-1-afternoon.pot
├── day-1-morning.pot
├── day-2-afternoon.pot
├── day-2-morning.pot
├── day-3-afternoon.pot
├── day-3-morning.pot
├── final-words.pot
├── solutions.pot
└── summary.pot
$ MDBOOK_OUTPUT='{"xgettext": {"depth": 2}}' mdbook build -d po/XX
2023-10-05 17:36:46 [INFO] (mdbook::book): Book building has started
2023-10-05 17:36:46 [INFO] (mdbook::book): Running the xgettext backend
2023-10-05 17:36:46 [INFO] (mdbook::renderer): Invoking the "xgettext" renderer
$ tree po/XX
po/XX
├── android
│ ├── aidl.pot
│ ├── build-rules.pot
│ ├── exercises.pot
│ ├── interoperability.pot
│ ├── logging.pot
│ ├── setup.pot
│ └── welcome.pot
$ MDBOOK_OUTPUT='{"xgettext": {"depth": 3}}' mdbook build -d po/XX
2023-10-05 17:36:46 [INFO] (mdbook::book): Book building has started
2023-10-05 17:36:46 [INFO] (mdbook::book): Running the xgettext backend
2023-10-05 17:36:46 [INFO] (mdbook::renderer): Invoking the "xgettext" renderer
$ tree po/XX
po/XX
├── android
│ ├── aidl
│ │ ├── changing-api.pot
│ │ ├── client.pot
│ │ ├── deploy.pot
│ │ ├── implementation.pot
│ │ ├── interface.pot
│ │ └── server.pot
│ ├── aidl.pot
│ ├── build-rules
│ │ ├── binary.pot
│ │ └── library.pot
│ ├── build-rules.pot
│ ├── exercises.pot
│ ├── interoperability
│ │ ├── with-c.pot
│ │ └── with-java.pot
│ ├── interoperability.pot
│ ├── logging.pot
│ ├── setup.pot
│ └── welcome.pot
$ MDBOOK_OUTPUT='{"xgettext": {"depth": 4}}' mdbook build -d po/XX
2023-10-05 17:36:46 [INFO] (mdbook::book): Book building has started
2023-10-05 17:36:46 [INFO] (mdbook::book): Running the xgettext backend
2023-10-05 17:36:46 [INFO] (mdbook::renderer): Invoking the "xgettext" renderer
$ tree po/XX
po/XX
├── android
│ ├── aidl
│ │ ├── changing-api.pot
│ │ ├── client.pot
│ │ ├── deploy.pot
│ │ ├── implementation.pot
│ │ ├── interface.pot
│ │ └── server.pot
│ ├── aidl.pot
│ ├── build-rules
│ │ ├── binary.pot
│ │ └── library.pot
│ ├── build-rules.pot
│ ├── exercises.pot
│ ├── interoperability
│ │ ├── with-c
│ │ │ ├── calling-c-with-bindgen.pot
│ │ │ └── calling-rust-from-c.pot
│ │ ├── with-c.pot
│ │ └── with-java.pot
│ ├── interoperability.pot
│ ├── logging.pot
│ ├── setup.pot
│ └── welcome.pot
Hi @antoniolinhart! Happy new year :partying_face:
Would you be able to rebase this PR on top of the latest main
? We should focus on getting it merged to prevent it from bit-rotting!
Even if the functionality to merge the split PO files are not yet there, the splitting functionality could be useful to someone (they would then have to write a tiny script to msgcat
everything back together before using the translation).
Hi @antoniolinhart! Happy new year 🥳
Would you be able to rebase this PR on top of the latest
main
? We should focus on getting it merged to prevent it from bit-rotting!Even if the functionality to merge the split PO files are not yet there, the splitting functionality could be useful to someone (they would then have to write a tiny script to
msgcat
everything back together before using the translation).
Happy new year! I got taken away with some high priority team projects so it's been a while since I've been able to look at this. That's a good point! My goal will be to (this week/next week) get the comments responded to / get it rebased on main
since I know the longer I wait the harder it will be :sweat_smile:
Thanks for your patience on this!
Happy new year! I got taken away with some high priority team projects so it's been a while since I've been able to look at this.
That's completely okay! The functionality here is luckily not blocking anything right now.
My goal will be to (this week/next week) get the comments responded to / get it rebased on
main
since I know the longer I wait the harder it will be 😅
Thanks for sticking with it! I think the conflicts should be small at this point, so it should be possible to revive this PR without too much trouble.
Should be ready for another round of review @mgeisler ! Thank you for your extended patience and support!
Should be ready for another round of review @mgeisler ! Thank you for your extended patience and support!
I'm the one thanking! Let me look it through now.
Hey @antoniolinhart, I hope the changes in #107 doesn't cause too big conflicts here! I think we can merge this straight after it gets rebased.
@mgeisler
Should be good to go! It appears that the diff base is not against the correct source so it's showing that I changed a lot of files that I obtained from rebasing on the updated code. I tried looking around but it appears that this merge is into google:main
which should have the updated files?
@mgeisler
Should be good to go!
Great!
It appears that the diff base is not against the correct source so it's showing that I changed a lot of files that I obtained from rebasing on the updated code. I tried looking around but it appears that this merge is into
google:main
which should have the updated files?
Yeah, something is off here — I'll try to untangle it and then merge the PR. It'll take me a day of two to get around to it :smile:
@mgeisler Should be good to go!
Great!
It appears that the diff base is not against the correct source so it's showing that I changed a lot of files that I obtained from rebasing on the updated code. I tried looking around but it appears that this merge is into
google:main
which should have the updated files?Yeah, something is off here — I'll try to untangle it and then merge the PR. It'll take me a day of two to get around to it 😄
Thanks!! Also the clippy failure was due to me leaving some debugging logs in by accident (after the last time I ran clippy) so that should be fixed now :+1:
Attention: Patch coverage is 91.91176%
with 66 lines
in your changes are missing coverage. Please review.
Project coverage is 90.58%. Comparing base (
a4ca49b
) to head (440e352
). Report is 28 commits behind head on main.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Thanks!! Also the clippy failure was due to me leaving some debugging logs in by accident (after the last time I ran clippy) so that should be fixed now 👍
Hey Antonio, I think I got your changes back into a simple branch. I don't actually know if you did anything wrong here. What I did on my side was to
split-pot-file
branch was.split-pot-file
to main
to throw away all commits and merges and what not.split-pot-file
back to the temporary branch. This added all the changes back, but as a delta to main
.If the tests pass, then I'll merge this next.
Thanks!! Also the clippy failure was due to me leaving some debugging logs in by accident (after the last time I ran clippy) so that should be fixed now 👍
Hey Antonio, I think I got your changes back into a simple branch. I don't actually know if you did anything wrong here. What I did on my side was to
- Create a temporary branch at the point where your
split-pot-file
branch was.- Hard reset
split-pot-file
tomain
to throw away all commits and merges and what not.- Soft reset
split-pot-file
back to the temporary branch. This added all the changes back, but as a delta tomain
.- Commit this under your name and force push back to your repository.
If the tests pass, then I'll merge this next.
Wonderful! Thank you I will try this out next time I run into the same issue.
Hey all! This is a PR in relation to #67 to split the pot file depending on the specified depth. Another PR will be submitted to change
gettext
to merge a directory ofpot
files.xgettext
to take in an integer for the depth to split the POT file. Remove the input-file param