Open olivroy opened 1 month ago
Thanks again for building this! I have been using this for my project, but I wanted to intersperse different levels of headings with my TODO items. I had a lot of trouble figuring out context for my to-do items in the middle of a large notebook.
To accomplish this, I wrote a printing function that takes proj_outline()
output and translates the tibble into something we can display with cli::tree()
. Here's an example tree with to-do items interspersed:
test.qmd
└─Test
├─Quarto
├─Running Code
│ ├─TODO: fix this in the code
│ └─A sub header
│ └─TODO: here's a todo in the text
└─Back to header 1
├─Dont skip me
│ └─header 5
│ └─TODO: testing section
└─Another sub header
└─TODO: section test
I am still playing with the formatting. I started using cli::format_inline()
to incorporate the contextual links you designed, but it doesn't show up here.
Below is the code I'm using. First, I simplify the outline tibble to something that makes sense to me. Then, I make the indent
variable which is based off of n_leading_hash
. Then, I transform a tibble with "indents" into a tree structure for cli::tree()
. I can rewrite this as a pull request eventually, but I want to take a shot at using lightparser
and spend more time customizing the appearance with colors and stuff before I integrate into your codebase.
outline_data <- reuseme::file_outline(path = "test.qmd") %>%
# Convert the many is_ columns into mutually exclusive "outline row types"
pivot_longer(
names_to = "type", names_prefix = "is_", c(
starts_with("is_"), -is_md, -is_second_level_heading_or_more, -is_roxygen_comment, -is_notebook
)
) %>%
# # Double check that types are mututally exclusive
# filter(all(value == F), .by = c(file_short, title, line_id))
# filter(sum(value) > 1, .by = c(file_short, title, line_id))
filter(value == T) %>%
# We drop these because they don't serve to add much context to TODOs (they don't affect heirarchy)
filter(type != "tab_or_plot_title") %>%
# Some useful definitions!
mutate(
title = coalesce(outline_el, title_el),
file_short = fs::path_file(file),
n_leading_hash = type %>% case_match(
c("todo_fixme", "tab_or_plot_title") ~ NA,
.default = n_leading_hash
)
) %>%
# For each file, stick a item at the top of the outline
group_by(file, file_short) %>%
group_modify(\(data, group) data %>% add_row(
.before = 0,
n_leading_hash = -1,
title = group$file_short,
type = "file"
)) %>%
mutate(
# Processing how title displays based on type
print_title = type %>% case_match(
"todo_fixme" ~ link,
.default = link_rs_api
) %>% coalesce(title) %>% map_chr(cli::format_inline),
# Assign TODO items (and other items missing n_leading_hash)
# to be indented under the last seen header level
indent = coalesce(n_leading_hash, zoo::na.locf0(n_leading_hash+1)),
# If there are any headers that skip an intermediate level, pick them up
skip_level = indent > lag(indent)+1,
skip_level_should_be = ifelse(skip_level, lag(indent)+1, NA),
skip_level_adjust = case_when(
# All the items below on the outline should be adjusted backwards
skip_level ~ skip_level_should_be-indent,
# Unless we reach a point on the outline where we're back up in
# the hierachy, so stop adjusting.
indent <= zoo::na.locf0(skip_level_should_be) ~ 0
) %>%
# Carry the adjustments to later rows
zoo::na.locf0() %>% coalesce(0),
indent = indent + skip_level_adjust
) %>%
ungroup() %>%
select(title, type, n_leading_hash, indent, print_title)
outline_data %>%
mutate(
# Give items IDs so titles do not have to be unique
item_id = as.character(row_number()),
indent_wider = indent,
x = TRUE
) %>%
# We need these wide cumsum `header1` type fields to determine which items belong to which parents
pivot_wider(names_from = indent_wider, values_from = x, values_fill = F, names_prefix = "header") %>%
mutate(across(starts_with("header"), cumsum)) %>%
# For each row, pick the IDs of all direct children from the outline
pmap(function(...) with(list(..., childdata = .), tibble(
title,
print_title,
indent,
item_id,
type,
parent_level_id = get(stringr::str_c("header", indent)),
children_ids = childdata %>%
dplyr::rename(childindent = indent) %>%
dplyr::filter(
childindent == indent+1,
cumsum(childindent == indent) == parent_level_id
) %>%
dplyr::pull(item_id) %>% list()
))) %>%
list_rbind() %>%
# View()
select(item_id, children_ids, print_title) %>%
cli::tree()
And here's the qmd text I used:
---
title: "Test"
format: html
editor: visual
---
## Quarto
Quarto enables you to weave together content and executable code into a finished document. To learn more about Quarto see <https://quarto.org>.
## Running Code
When you click the **Render** button a document will be generated that includes both content and the output of embedded code. You can embed code like this:
```{r}
1 + 1
# TODO: fix this in the code
You can add options to executable code like this
#| echo: false
2 * 2
The echo: false
option disables the printing of code (only output is displayed).
TODO: here's a todo in the text
TODO: testing section
TODO: section test
thank you so much for your insightful comments! I am trying to wrap up parsing roxygen2 comments properly. I will get to this soon!
I really like that you created an example code! I will try to see what I can come up with!
For indentation, it is planned. As you can see, there are already different colors for first level headings vs the rest.
I divided topics between "important" and "not_important", but I can add more categories, colors and more levels of indentation!
Current iteration of my version! Fork and pull request incoming this week...
Features I plan to look at:
Parent -> Child
combined nodes for brevitycli_h1()
type markup, I could adapt it to use special styles for the top-level nodes in the treeAfter playing with the complete_todo()
links a bit, I really like the idea but I don't find it useful enough to keep for me. I update todo annotations relatively rarely, so it isn't a hassle to click the pound sign (to jump to the line) and manually delete the item. Additionally, sometimes TODO items have code on the left side and removing the line breaks the code. I'm not sure supporting this feature is worth all the regex work and potential risk of damaging my scripts.
Good to hear!
For complete_todo()
, I understand, I did fix that up though in https://github.com/olivroy/reuseme/commit/14ec8e7146bdb05454f5193507e76ad98d563e89.
Let me know if it still breaks for you.
Also instead of showing Done v?, could also just be green done clickable emoji?
I just disabled it generally in 92c7a66d0a835bea9246deae036931fae49394a1 (only keeping it for active file and files named TODO.R
Apologies in advance for potentially breaking your code
I think I will disable roxygen comments parsing
I really like how you style differently the text!
complete_todo()
links should probably just appear if you are requesting the active file only. You are right that you likely won't need to complete todos when requesting outline for another project.
@violetcereza I am looking forward to your PR! Feel free to open smaller chunks PRs first to get changes you'd like to see first. https://code-review.tidyverse.org/author/focused.html
Add as much snapshot tests as you can, so it is easy to visualize output. Therefore, if some things end up changing, we will have a visual reference. https://testthat.r-lib.org/articles/snapshotting.html
Let me know if you have any questions
Features I plan to look at:
- Collapse tree items with one child into
Parent -> Child
combined nodes for brevity
👍 looking forward to seeing examples of this!
- Somehow adapt to HTML output, so I can embed the TODO list into a notebook (or copy/paste into word processor) to show my boss
That is a good idea!
- If you wanted to use
cli_h1()
type markup, I could adapt it to use special styles for the top-level nodes in the tree
Not strong opinions there. Recently, I fixed titles h3 to escape markup https://github.com/olivroy/reuseme/commit/b4803a244a484a7b686d05f28ea7be91f4d412c3#diff-708c3b67fc1b91c821cc6c00c9c8108cabcdd44c7edd9366cf62a373b717ca0cR440 in outline.R
- Use lightparser
If possible, file this a separate PR?
- Rework how outlines are stored under the hood a bit
I will be happy to see this too! Separate PR if possible too :)
I am looking forward. Here, I am seeing 3 steps.
Add lightparser support for parsing md/ Rmd / qmd files.
Tweak how outline is stored to work better with tree
Add support for tree. (Ideally this would affect the print method) I can let you show me, but my initial thoughts are the following. (I really like how the tree looks, and will probably end up deprecating the current way of doing, but for now I'd like to keep both functional unless there are no drawbacks from using trees)
Add an option that tells that you want tree output options("reuseme.outline_tree")
for example (that would strip complete_todo()
.
At the end of file_outline()
, the following code.
if (isTRUE(getOption("reuseme.outline_tree") {
# modify result to be tree in `outline_tree()` helper.
result <- outline_tree(result)
class(result) <- c("cli_tree", "outline_report", "tbl_df", "tbl", "data.frame")
}
This way, this could just use the `cli:::print.cli_tree() method?
Alternatively (maybe better) refactor the print.outline_report()
to print differently whether the tree option is TRUE.
Let me know what you think!
Any (non-trivial) code related to trees should live in R/outline-tree.R
:)
Don't let this remove your creativity, these are just the thoughts I have assembled since yesterday.
Apologies also if some of my recent changes broke your code. (I am still experiencing). Hopefully it is not too bad to repair.
I have removed some variables from output:
line_id
to line
), added some features.Feel free to share any code. I can rework it to account for my recent changes. And make adjustments as needed to revert some changes.
Any other comment / feedback welcome! You inspired me to make changes I didn't think of at first (avoid printing complete_todo()
all the time is a big one!)
Thanks! I will be taking a look at it today...
Here's what I'm thinking big picture for the outline interface:
WORK
tag while I don't. Other people might have custom annotations that come with activated cli linkstype = "project"
and special custom annotation that activates reuseme::proj_switch
reuseme::outline
API is getting complicated, and a lot of the tasks do two things: filter the items and change how they displaydplyr
and the resulting table is passed to the print method. outline_report
and the following columns:file, title, type, line, indent
theme
API could provide total back compatibility. indent
rather than the parent/children IDs required by cli_tree()
. While both hierarchical structures have their strengths, it's easier to read scan down indent
in table format. However, rearranging items will break the structure of indents. I propose implementing sorting in the sort
expression to print.outline_report
to make this easy and flexible.proj_outline(proj = NULL)
If no proj
, proxy for dir_outline(rstudioapi::getActiveProject())
otherwise dir_outline(proj_list(proj))
.
dir_outline(recurse = T)
Run file_outline()
on everything and aggregate into tibble under top level folder item.
file_outline(path = rstudioapi::documentPath())
Run file_outline()
on everything and aggregate into tibble under top level folder item. Uses my algorithm to turn arbitrary indentation into strict hierarchical levels and make indent
variable more meaningful.
print.outline_report(outline, sort = F, collapse_nodes = T, prune_nodes = T, theme = NULL)
Prints a multicolored tree. Uses my algorithm to turn outline indentation into parent/child structure. Optionally collapse nodes with one child into Parent -> Child
nodes for brevity. Optionally prune nodes with no sub-annotations.
Allows sort
with a data mask that operates recursively on sub-tibbles of sibling items (with a nod to data.tree sorting).
theme
is a function(outline, indent)
that allows customization for output (TBD).
Please let me know if my proposed API simplifications destroy any functionality you were using.
proj_outline(pattern = "x")
# NOW
proj_outline() |>
filter(file |> str_detect("x"))
proj_outline(print_todo = F)
# NOW
proj_outline() |>
filter(type != "todo")
proj_outline(work_only = F)
# NOW
proj_outline() |>
print(prune_nodes = F)
proj_outline(alpha = T)
# NOW
proj_outline() |>
print(sort = title)
proj_outline(recent_only = T)
# NOW
# Only select 5 most recently modified files?
proj_outline() |>
mutate(modified = file %>% map_dbl(file.mtime)) |>
filter(modified >= modified |> unique() |> sort() |> nth(-5))
# Extra tricks: print only files that have TODO items, and only up to level 3 headings
# (include todos under omitted headings)
# (print headings with no todo items as well)
proj_outline() |>
filter(.by = file, any(type == "todo"), indent <= 3 | type != "heading") |>
print(prune_nodes = F)
data <- proj_outline()
data$file_path # redundant with file?
data$file_short # use fs::path_file()
data$file_ext # use fs::path_ext()
# These should be cleaned and aggregated into `title` with some choices made by `file_outline()`
c(data$content,
data$outline_el,
data$title_el, etc...)
# These should be internal to `print.outline_report`
c(data$link,
data$text_in_link,
data$style_fun,
data$link_rs_api,
data$file_hl
)
I like this! Do you want to open a PR, or do you want me to make these changes first?
Hey I know I am procrastinating writing unit tests on my PR...
But WRT to lightparser: I started trying out VSCode this week and I learned about "Language Server Protocols." Basically, there's too many types of files and too many IDEs and LSP provides a common interface and prevents the headache of re-implementing code analysis. I wonder if we could plug into the foldingRangeProvider
service from languageserver
to generate outlines.
I understand. On my side, I have been able to successfully move link creation inside the print method. I am thinking about what you are proposing.
Feel free to continue sharing what you find! The end goal is to improve our workflows, not to stick to a suboptimal solution
Please comment on this issue if you tried the
*_outline()
functions and would like some functionality or encountered something unexpectedcf https://github.com/rstudio/rstudioapi/issues/153#issuecomment-2116139239
Current plans
roxygen2::parse_package()
+ md (#29)Known issues