jhudsl / ari

:dancers: The Automated R Instructor
https://jhudatascience.org/ari/
Other
146 stars 37 forks source link

ariExtra immigration #46

Closed howardbaek closed 10 months ago

howardbaek commented 1 year ago

Purpose/implementation Section

Move a few functions from ariExtra into ari so that users don't have to install ariExtra when they run Coqui TTS. They would only need ari and text2speech.

What changes are being implemented in this Pull Request?

Functions that immigrated from ariExtra:

download_gs_file.R

pptx_notes.R

UPDATE (10/16/2023)

Eventually, I removed the above functions to two separate R packages

cansavvy commented 1 year ago

A couple questions:

1) Will this alone get rid of the requirement for ariExtra for downstream mario/loqui? Is there a way we can test that? 2) We should document this and about ariExtra about how it is expected to be used in the future. We might want to figure out what a good plan for this is, that doesn't require us to maintain more packages, but also makes it clear to users as to where they go for what functions. 3) I'm assuming the addition of these functions doesn't make this package too big? https://community.rstudio.com/t/data-size-limits-for-packages/3748/2 But we should check that. When you run devtools::build() how big is the resulting file?

howardbaek commented 1 year ago

A couple questions:

  1. Will this alone get rid of the requirement for ariExtra for downstream mario/loqui? Is there a way we can test that?
  2. We should document this and about ariExtra about how it is expected to be used in the future. We might want to figure out what a good plan for this is, that doesn't require us to maintain more packages, but also makes it clear to users as to where they go for what functions.
  3. I'm assuming the addition of these functions doesn't make this package too big? https://community.rstudio.com/t/data-size-limits-for-packages/3748/2 But we should check that. When you run devtools::build() how big is the resulting file?
  1. I was on this development branch and clicked on "Install Package" in RStudio's Build tab to install this branch of the ari package locally and tested the following:
## Proof of concept script
library(ari)

gs_url <- "https://docs.google.com/presentation/d/1sFsRXfK7LKxFm-ydib5dxyQU9fYujb95katPRu0WVZk/edit#slide=id.p"

# Script (text)
pptx_path <- download_gs_file(gs_url, out_type = "pptx")
pptx_notes_vector <- pptx_notes(pptx_path)

# Images
pdf_path <- download_gs_file(gs_url, out_type = "pdf")
filenames <- pdf_to_pngs(pdf_path)

# Create a video from images and text
ari_spin(images = filenames, paragraphs = pptx_notes_vector, service = "coqui")

Everything ran smoothly, so looks like we've removed ariExtra! I haven't tested this on mario yet though. I'll test it right now, but did you have any other ideas on how to test this on mario?

  1. Agreed, I'll set up a meeting to talk over this.
  2. I ran devtools::build() and the resulting file, ari_0.4.1.tar.gz is 230.2 KB.
howardbaek commented 1 year ago

@cansavvy Do you mind approving this PR so I can merge into main?

cansavvy commented 1 year ago

@cansavvy Do you mind approving this PR so I can merge into main?

Should we wait until we get a chance to talk to @seankross about ari tomorrow?

cansavvy commented 1 year ago

After #45 is merged you’ll need to merge main into here to update this branch. I think some of these failed checks will be resolved by the changes in #45 but let’s see. Mainly you’ll want to pull the GitHub version of text2speech since it has the updates you need that aren’t in CRAN yet

cansavvy commented 1 year ago

We’ll just need to address the failed checks:

ERROR: lazy loading failed for package ‘ari’

it can’t load pdftools so we’ll need to look into that. May need to add it as a dependency if it is not already listed. Or we should look into why it can’t seem to be loaded.

howardbaek commented 1 year ago

When I ran devtools::check(), I kept on getting this error message:

Running examples in ‘ari-Ex.R’ failed

The error most likely occurred in:

> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: pad_wav
> ### Title: Pad Wave Objects
> ### Aliases: pad_wav
> 
> ### ** Examples
> 
> wav_list <- list(
+   tuneR::noie(duration = 2000),
+   tuneR::noise(duration = 1000)
+ )
Error: 'noie' is not an exported object from 'namespace:tuneR'
Execution halted

However, the error message does not appear when running devtools::check() on GitHub Actions. To avoid seeing this message locally, I decided to wrap the examples in a \dontrun{} chunk.

@cansavvy if you are okay with this, do you mind approving this PR so I can merge into main?

howardbaek commented 1 year ago

Looks like I can't merge unless the code owner, @seankross , approves

seankross commented 1 year ago

I would prefer not to add anything else into Ari until the new text2speech is on CRAN. Once that happens there needs to be a major Ari code review and subsequent implementation changes. Once that is stable we can open Ari up to further development.

howardbaek commented 1 year ago

That makes sense to me, @seankross

howardbaek commented 1 year ago

@seankross Now that text2speech is on CRAN, when should we start the major code review of ari?

seankross commented 1 year ago

Can you schedule me for an hour and we can come up with a roadmap and start working? https://calendly.com/seankross

On Tue, Aug 15, 2023 at 12:16 PM Howard Baek @.***> wrote:

@seankross https://github.com/seankross When should we start the major code review of ari?

— Reply to this email directly, view it on GitHub https://github.com/jhudsl/ari/pull/46#issuecomment-1679463884, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAN4JJLDSWSXNTFWNNLGY2TXVPDHTANCNFSM6AAAAAAYY6K2BU . You are receiving this because you were mentioned.Message ID: @.***>

howardbaek commented 1 year ago

Yes, of course

On Tue, Aug 15, 2023 at 3:28 PM Sean Kross @.***> wrote:

Can you schedule me for an hour and we can come up with a roadmap and start working? https://calendly.com/seankross

On Tue, Aug 15, 2023 at 12:16 PM Howard Baek @.***> wrote:

@seankross https://github.com/seankross When should we start the major code review of ari?

— Reply to this email directly, view it on GitHub https://github.com/jhudsl/ari/pull/46#issuecomment-1679463884, or unsubscribe < https://github.com/notifications/unsubscribe-auth/AAN4JJLDSWSXNTFWNNLGY2TXVPDHTANCNFSM6AAAAAAYY6K2BU>

. You are receiving this because you were mentioned.Message ID: @.***>

— Reply to this email directly, view it on GitHub https://github.com/jhudsl/ari/pull/46#issuecomment-1679711539, or unsubscribe https://github.com/notifications/unsubscribe-auth/AMDQK4AXVO5ORIJGPRESIK3XVPZZ7ANCNFSM6AAAAAAYY6K2BU . You are receiving this because you were mentioned.Message ID: @.***>

howardbaek commented 1 year ago

I've completed the first bullet in https://github.com/jhudsl/ari/issues/49, "get rid of text2speech specific code". Basically, I've changed the function arguments of ari_narrate() and ari_spin() to tts_engine, tts_engine_args, and tts_engine_auth. This lets anyone use ari and supply their own tts functionality. Of course, ari requires the tts functionality to come in a certain shape, which I will document in a future branch.

If you are wondering what the relationships are between the main ari_*() functions, I've documented it here.

@seankross @cansavvy Can we merge this PR and close this off so I can start working on the other bullets from https://github.com/jhudsl/ari/issues/49 in a new branch? This PR is getting really big.

muschellij2 commented 1 year ago

Is the default behavior the same with those functions or are these breaking changes to others code if they update to this version?

On Thu, Aug 31, 2023 at 4:03 PM Howard Baek @.***> wrote:

I've completed the first bullet in #49 https://github.com/jhudsl/ari/issues/49, "get rid of text2speech specific code". Basically, I've changed the function arguments of ari_narrate() and ari_spin() to tts_engine, tts_engine_args, and tts_engine_auth. This lets anyone use ari and supply their own tts functionality. Of course, ari requires the tts functionality to come in a certain shape, which I will document in a future branch.

If you are wondering what the relationships are between the main ari_*() functions, I've documented it here https://docs.google.com/presentation/d/1Vjvq7PYuWsTkGi2EkXpnk0KtQYhbPSidBhMFQcqyb8I/edit#slide=id.g24d14489617_0_0 .

@seankross https://github.com/seankross @cansavvy https://github.com/cansavvy Can we merge this PR and close this off so I can start working on the other bullets from #49 https://github.com/jhudsl/ari/issues/49 in a new branch? This PR is getting really big.

— Reply to this email directly, view it on GitHub https://github.com/jhudsl/ari/pull/46#issuecomment-1701707241, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIGPLQND4HKJUL3IKHZRM3XYDU2HANCNFSM6AAAAAAYY6K2BU . You are receiving this because your review was requested.Message ID: @.***>

seankross commented 1 year ago

Assume the next version of Ari will introduce breaking changes. The version number will go to 1.. to reflect that.

On Thu, Aug 31, 2023 at 1:52 PM John Muschelli @.***> wrote:

Is the default behavior the same with those functions or are these breaking changes to others code if they update to this version?

On Thu, Aug 31, 2023 at 4:03 PM Howard Baek @.***> wrote:

I've completed the first bullet in #49 https://github.com/jhudsl/ari/issues/49, "get rid of text2speech specific code". Basically, I've changed the function arguments of ari_narrate() and ari_spin() to tts_engine, tts_engine_args, and tts_engine_auth. This lets anyone use ari and supply their own tts functionality. Of course, ari requires the tts functionality to come in a certain shape, which I will document in a future branch.

If you are wondering what the relationships are between the main ari_*() functions, I've documented it here < https://docs.google.com/presentation/d/1Vjvq7PYuWsTkGi2EkXpnk0KtQYhbPSidBhMFQcqyb8I/edit#slide=id.g24d14489617_0_0>

.

@seankross https://github.com/seankross @cansavvy https://github.com/cansavvy Can we merge this PR and close this off so I can start working on the other bullets from #49 https://github.com/jhudsl/ari/issues/49 in a new branch? This PR is getting really big.

— Reply to this email directly, view it on GitHub https://github.com/jhudsl/ari/pull/46#issuecomment-1701707241, or unsubscribe < https://github.com/notifications/unsubscribe-auth/AAIGPLQND4HKJUL3IKHZRM3XYDU2HANCNFSM6AAAAAAYY6K2BU>

. You are receiving this because your review was requested.Message ID: @.***>

— Reply to this email directly, view it on GitHub https://github.com/jhudsl/ari/pull/46#issuecomment-1701765439, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAN4JJOLNMSRYC4I7Y6NKF3XYD2R5ANCNFSM6AAAAAAYY6K2BU . You are receiving this because you were mentioned.Message ID: @.***>

howardbaek commented 1 year ago

The function will act the same, but the argument names changed

seankross commented 1 year ago

Also the number of and position of arguments will likely change.

On Thu, Aug 31, 2023 at 3:38 PM Howard Baek @.***> wrote:

The function will act the same, but the argument names changed

— Reply to this email directly, view it on GitHub https://github.com/jhudsl/ari/pull/46#issuecomment-1701871966, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAN4JJONFFANZJI65HL4EVTXYEG4XANCNFSM6AAAAAAYY6K2BU . You are receiving this because you were mentioned.Message ID: @.***>

cansavvy commented 1 year ago

After discussing this PR with @seankross we need to figure out how to re organize these functions into modular packages. See issue #51 I've turned this PR to a draft so we are less likely to accidentally merge it.

howardbaek commented 10 months ago

This 4-month old PR is ready to merge! I addressed the comments left by Sean in this branch and the two sub-branches (burn-subtitles, reduce-arguments).