m0j0hn / editor-on-fire

Automatically exported from code.google.com/p/editor-on-fire

Add ability to insert silence at beginning of chart #144

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
This was a previously proposed idea, but it has recently been requested again, 
so I'm putting it here.  The last time we visited this feature, it seemed like 
the OGG format would allow one OGG to be concatenated to another.  If that is 
so, we could easily generate a silent OGG of the appropriate length and append 
the chart audio to it at the binary level.  If not, perhaps a stream copy 
process would work?

Original issue reported on code.google.com by raynebc on 19 Aug 2010 at 5:18

GoogleCodeExporter commented 9 years ago
I think the latest version of AlOGG might support encoding OGG Vorbis. If the 
append method works we could generate a blank SAMPLE of the length requested by 
the user, encode it as OGG Vorbis, and append the audio to it. Otherwise we 
would have to re-encode the OGG with the silence added.

Original comment by xander4j...@yahoo.com on 19 Aug 2010 at 12:35

GoogleCodeExporter commented 9 years ago
I think I remember that it did support encoding OGG, so if it does that well, 
we can remove the oggenc2 dependency.  Re-encoding could be pretty destructive 
though, so we might want to investigate logic for performing a stream copy 
append if appending the entire files doesn't work.

Original comment by raynebc on 19 Aug 2010 at 2:23

GoogleCodeExporter commented 9 years ago
According to the Oggscissors website, OGG's smallest decodable unit is around 
1024 audio samples (20ms).  This means that stream copying is going to be too 
inaccurate to use with this feature, since we'd need to be able to insert 
silence in increments of 1 millisecond.
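
To put that granularity in perspective, the duration of a block of samples depends on the sample rate. A minimal sketch of the arithmetic, where the helper name is hypothetical and 44100 Hz is only an assumed typical rate (the thread does not state one):

```c
/* Duration in milliseconds of a block of PCM samples at a given
   sample rate.  Hypothetical helper, for illustration only. */
double block_duration_ms(unsigned long samples, unsigned long sample_rate)
{
    return (double)samples * 1000.0 / (double)sample_rate;
}
```

For 1024 samples at 44100 Hz this works out to a little over 23 ms, the same order of magnitude as the figure quoted above and far coarser than the 1 ms steps the feature needs.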

Original comment by raynebc on 23 Sep 2010 at 5:57

GoogleCodeExporter commented 9 years ago
I just encountered the Ogg Video Tools:
http://en.flossmanuals.net/TheoraCookbook/ManipulateOggTheoraFiles

If an OGG of silent audio was created to match the number of channels, sample 
rate and data rate of the chart audio, the oggCat utility should be able to 
concatenate them without any need to re-encode.

Original comment by raynebc on 13 Oct 2010 at 6:56

GoogleCodeExporter commented 9 years ago
Chatting on the Theora IRC channel, they're indicating that re-encoding would 
be necessary, as oggCat will likely merge as two separate streams instead of 
concatenating two files into one.

Original comment by raynebc on 13 Oct 2010 at 11:31

GoogleCodeExporter commented 9 years ago
Still, I think the vorbis player would play the two streams back to back if 
they both existed in the file, at least according to the vorbis documentation. 
It even says that the programmer should pay attention to what ov_read() returns 
because the format may change when it crosses into the next stream.

Original comment by xander4j...@yahoo.com on 14 Oct 2010 at 4:45

GoogleCodeExporter commented 9 years ago
I just used oggCat to put two files together and it worked perfectly. So we now 
have an option to add silence. We could include the executable for oggCat in 
the Windows version and require that oggvideotools be installed on Linux (the 
option can be grayed out if the oggCat command doesn't exist). We just need to 
make a function that creates a blank sample with a specified length, encode it, 
oggCat that file and the loaded file together, and load the newly created file.

Original comment by xander4j...@yahoo.com on 14 Oct 2010 at 4:58

GoogleCodeExporter commented 9 years ago
Despite what might have been said about chained files in FoF, I just tested one 
and it worked. This should be fairly easy to implement now that we know how to 
do it.

Original comment by xander4j...@yahoo.com on 14 Oct 2010 at 5:09

GoogleCodeExporter commented 9 years ago
With NewCreature's new code, EOF is a step closer to having this functionality.  
It's likely that a user may change the amount of leading silence, so during the 
first instance, the original OGG can be backed up as currently designed, the 
leading silence can be generated and turned into an OGG (assuming that's 
necessary for oggCat), and the silence and the backup can be concatenated to 
create the new OGG file.  On subsequent uses, the backup should probably be 
re-used so that it can be restored if necessary.

I imagine that in most cases, people will want one or all of these options 
regarding adding silence:
1. Adding a specific number of milliseconds of silence
2. Adding a specific number of beats of silence (such as if they want to have 2 
measures worth of silence)
3. Taking the current first beat marker's position into account, determine how 
much silence would be needed to pad the current chart delay to a specific 
number of milliseconds or beats before the first beat marker
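
The three options above reduce to one calculation of how many milliseconds of silence to generate. A rough sketch follows; the enum, the function names and the clamp-to-zero behavior for the padding case are assumptions made for illustration, not EOF's actual code:

```c
/* Length of one beat in milliseconds at the given tempo. */
double beat_length_ms(double bpm)
{
    return 60000.0 / bpm;
}

/* Hypothetical modes matching the options listed above. */
typedef enum {
    SILENCE_ADD_MS,     /* option 1: add a fixed number of milliseconds */
    SILENCE_ADD_BEATS,  /* option 2: add a number of beats */
    SILENCE_PAD_MS,     /* option 3: pad the delay up to N milliseconds */
    SILENCE_PAD_BEATS   /* option 3: pad the delay up to N beats */
} silence_mode;

/* Milliseconds of silence to generate.  current_delay_ms is the
   position of the first beat marker (the chart's MIDI delay). */
double silence_to_add_ms(silence_mode mode, double amount, double bpm,
                         double current_delay_ms)
{
    double target;

    switch (mode) {
    case SILENCE_ADD_MS:
        return amount;
    case SILENCE_ADD_BEATS:
        return amount * beat_length_ms(bpm);
    case SILENCE_PAD_MS:
        target = amount;
        break;
    case SILENCE_PAD_BEATS:
        target = amount * beat_length_ms(bpm);
        break;
    default:
        return 0.0;
    }
    /* Padding never removes audio: if the delay already meets the
       target, no silence is added (an assumption of this sketch). */
    return (target > current_delay_ms) ? target - current_delay_ms : 0.0;
}
```

With a delay of 225 ms, padding to 400 ms would call for 175 ms of silence, and one beat at 120BPM would call for 500 ms.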

Original comment by raynebc on 15 Oct 2010 at 5:00

GoogleCodeExporter commented 9 years ago
All three options sound nice and could probably be implemented into one dialog 
with radio buttons and a text field for entering the numbers.

I agree about the backup. When the user adds silence it should be considered a 
change of leading silence to avoid confusion. If the user changes it again it 
will not add more silence but simply change the amount of leading silence 
compared to the original file.

Original comment by xander4j...@yahoo.com on 15 Oct 2010 at 6:23

GoogleCodeExporter commented 9 years ago
The silence adding function is working now. All we need is a menu option and 
dialog for this feature to be completed.

Original comment by xander4j...@yahoo.com on 15 Oct 2010 at 9:45

GoogleCodeExporter commented 9 years ago
I'm still confused about the creation of a WAV file.  I thought oggCat required 
you to give it two OGG files to join, will it accept an OGG and a WAV, or is 
the WAV file being converted to OGG before oggCat is used?

Original comment by raynebc on 15 Oct 2010 at 10:15

GoogleCodeExporter commented 9 years ago
The WAV file is created so it can be encoded into OGG with oggenc.

Original comment by xander4j...@yahoo.com on 15 Oct 2010 at 10:20

GoogleCodeExporter commented 9 years ago

Original comment by xander4j...@yahoo.com on 15 Oct 2010 at 10:46

GoogleCodeExporter commented 9 years ago
In the Theora IRC channel the other day, they mentioned that oggCat would work 
the way we want it to provided that the OGG headers match pretty much exactly.  
So if oggenc2 was used to convert an MP3 into the OGG used for the chart audio, 
and oggenc2 was also used to generate the silence, this could definitely work.  
We may need to test this feature out using a guitar.ogg that was encoded 
elsewhere, such as from Audacity.  It's possible that this would cause oggCat 
to chain them instead, in which case FoFiX might not play it correctly.

Original comment by raynebc on 15 Oct 2010 at 10:53

GoogleCodeExporter commented 9 years ago
Well, the feature is working pretty well from what I can tell. I already read 
the settings from the currently loaded OGG and generate the silence OGG 
accordingly. More testing is definitely needed, though. I'm almost done with 
the dialog so you'll be able to test it soon.

Original comment by xander4j...@yahoo.com on 15 Oct 2010 at 11:03

GoogleCodeExporter commented 9 years ago
I've implemented the Leading Silence dialog. The only problem left to solve is 
how to handle subsequent adjustments. Currently, if the user opts to have the 
notes and beat markers adjusted, subsequent adjustments will cause the notes 
and beats to be adjusted incorrectly. This is due to the original OGG being 
used for subsequent adjustments. We need a way to determine what the current 
amount of leading silence is so we can get the notes and beats into the correct 
positions.

Original comment by xander4j...@yahoo.com on 15 Oct 2010 at 11:27

GoogleCodeExporter commented 9 years ago
I think a variable to store the original MIDI delay would be a good way to do 
it.  Alternatively, an entire OGG profile could be created for the original 
audio file.

Original comment by raynebc on 15 Oct 2010 at 11:34

GoogleCodeExporter commented 9 years ago
For now I am using the length of the backup OGG compared to the 
silence-adjusted OGG to determine the correct adjustment. I'll leave this issue 
open until we do more testing.
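
One way to read the length-comparison approach: the silence currently in place is the difference between the two file lengths, and notes and beats only need to move by the change relative to that. A small sketch under that assumption (function names are hypothetical, lengths are in milliseconds):

```c
/* Leading silence currently present, inferred from the length of the
   silence-adjusted OGG minus the length of the backup OGG. */
long current_leading_silence_ms(long backup_length_ms, long adjusted_length_ms)
{
    long diff = adjusted_length_ms - backup_length_ms;
    return (diff > 0) ? diff : 0;
}

/* How far notes and beat markers should move when the leading silence
   is rebuilt from the backup with a new requested amount.  A negative
   result means they move earlier. */
long note_shift_ms(long backup_length_ms, long adjusted_length_ms,
                   long requested_silence_ms)
{
    return requested_silence_ms
         - current_leading_silence_ms(backup_length_ms, adjusted_length_ms);
}
```

For example, if 200 ms of silence is already present and the user now asks for 500 ms, the notes shift by 300 ms rather than 500 ms.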

Original comment by xander4j...@yahoo.com on 16 Oct 2010 at 12:19

GoogleCodeExporter commented 9 years ago

Original comment by xander4j...@yahoo.com on 16 Oct 2010 at 12:31

GoogleCodeExporter commented 9 years ago
I'm having some trouble with this. oggCat won't let me do the operation unless 
both OGGs have the exact same bitrate but oggenc only lets you specify the 
bitrate in kbps. This is kind of ridiculous since vorbis is VBR encoded and the 
bitrate is only an average. One file I tried was averaged at 64001bps and it 
wouldn't let me cat a 64000bps file to it (again, oggenc won't even let me 
specify 64001bps).

Original comment by xander4j...@yahoo.com on 16 Oct 2010 at 1:00

GoogleCodeExporter commented 9 years ago
Strangely, it's only picky about the bitrate when I use an OGG not made with oggenc.

Original comment by xander4j...@yahoo.com on 16 Oct 2010 at 1:58

GoogleCodeExporter commented 9 years ago
Finally some progress! First, I found that oggvideotools has a silence 
generating tool so I am using that instead of generating a WAV and converting 
myself. Secondly, I modified alogg (alogg_get_bitrate_ogg) to give me the 
nominal bitrate of the OGG file instead of the calculated average. This is 
giving me a bitrate that I can use with oggSilence to generate a compatible 
OGG. Just a few more things left and this issue is done!

Original comment by xander4j...@yahoo.com on 16 Oct 2010 at 2:46

GoogleCodeExporter commented 9 years ago
Everything seems to be working as expected. I'm still going to leave it open 
for more testing, though.

Original comment by xander4j...@yahoo.com on 16 Oct 2010 at 4:12

GoogleCodeExporter commented 9 years ago
Adding silence seems to corrupt the OGG sometimes. The one song I had trouble 
with was converted to MP3 from YouTube and then to OGG through oggenc. The song 
has a lot of artifacts in it due to the transcoding process so I'm thinking 
that might have something to do with it. I haven't had trouble with other OGGs. 
It's possible oggCat isn't doing a great job lining up the vorbis data.

Original comment by xander4j...@yahoo.com on 16 Oct 2010 at 7:50

GoogleCodeExporter commented 9 years ago
Despite NewCreature's experience with oggCat actually appending the data 
streams in the *nix build, in Windows, I'm seeing that oggCat always re-encodes 
the audio.  This happens regardless of whether oggSilence was used to generate 
the silent OGG or if oggenc2 was used.

Original comment by raynebc on 19 Oct 2010 at 6:35

GoogleCodeExporter commented 9 years ago
Nobody in the Theora channel knew why it wasn't working, so I emailed the 
ogg-dev mailing list with some sample OGG files that were created by oggenc2.  
With any luck, they'll be able to replicate an issue and provide a fix, or 
indicate what's being done wrong.

Original comment by raynebc on 19 Oct 2010 at 10:03

GoogleCodeExporter commented 9 years ago
The Windows binary for oggSilence crashes when trying to generate OGGs with a 
bit rate of 128Kbps or higher.  I'm communicating with the developer to see if 
he can determine why this happens, but if he cannot provide a resolution, EOF 
should probably continue using its current logic to generate a WAV file and 
then convert that to OGG with oggenc2.  This would probably be for the best so 
that the OGGs would be more readily compatible with OGGs it converted from MP3 
files in the New Chart wizard.

Original comment by raynebc on 21 Oct 2010 at 7:37

GoogleCodeExporter commented 9 years ago
Having a wave file creator will also provide a convenient means of implementing 
preview phrases (issue 168), as the samples in the preview range could be 
decoded to an array, a WAV file could be created, and oggenc2 could be used to 
create preview.ogg.

Original comment by raynebc on 21 Oct 2010 at 9:13

GoogleCodeExporter commented 9 years ago
There is some inaccuracy in EOF's current implementation (creating the WAV and 
encoding to OGG).  On an example chart of mine, the first tempo is 
119.521912BPM with a MIDI delay of 225.  After adding 200ms of silence:
The new MIDI delay is 462

After undoing that and padding to 400ms of silence:
The new MIDI delay is 415

After undoing that and adding 2 beats of silence:
The new MIDI delay is 1251

Even if I set the tempo on a chart's first beat marker to 120BPM and add 1 beat 
of silence, 515ms of silence is added instead of 500ms.  I do not know if the 
oggSilence method had this problem.
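
For reference, the size of the error in each case can be worked out from the old delay, the requested silence and the delay EOF reported. The helpers below are a hypothetical sketch of that arithmetic, not code from EOF:

```c
/* Length of one beat in milliseconds at the given tempo. */
double beat_ms(double bpm)
{
    return 60000.0 / bpm;
}

/* Difference between the MIDI delay observed after the operation and
   the delay expected from the old delay plus the requested silence
   (all values in milliseconds). */
double overshoot_ms(double old_delay, double requested_silence,
                    double observed_delay)
{
    return observed_delay - (old_delay + requested_silence);
}
```

Applied to the figures above: adding 200ms overshoots by 37ms (462 instead of 425), padding to 400ms overshoots by 15ms, and adding 2 beats at 119.521912BPM (about 1004ms) overshoots by roughly 22ms (1251 instead of about 1229).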

Original comment by raynebc on 25 Oct 2010 at 6:26

GoogleCodeExporter commented 9 years ago
r525 adds the untested eof_add_silence_recode function.  Using the re-encoding 
method of adding silence will result in some quality loss but would be the most 
reliable method, since the audio is decoded to PCM, worked on at the sample 
level and then encoded back to OGG.  This should avoid the two problems we've 
seen with using oggCat:

-Corrupted audio
-Incorrect resulting audio length (probably due to oggCat splitting the streams 
at the nearest possible location, such as OGG packet, instead of at the 
appropriate PCM sample)

Once this function is tested/debugged, it could be made a radio button option 
in the Leading Silence menu as to whether EOF should attempt a lossless 
operation (oggCat) or a lossy operation (re-encode).
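
The sample-level step of the re-encode method could look roughly like the sketch below. It is not the code of eof_add_silence_recode(); it assumes the audio has already been decoded to interleaved signed 16-bit PCM, and the function name and signature are made up for illustration:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated buffer holding silence_ms of silence
   followed by the source audio, or NULL if allocation fails.
   Frames are counted per channel group; the caller frees the result. */
int16_t *prepend_silence(const int16_t *src, size_t src_frames,
                         unsigned channels, unsigned long sample_rate,
                         unsigned long silence_ms, size_t *out_frames)
{
    /* Round the requested time to the nearest whole frame. */
    size_t silence_frames =
        (size_t)(((unsigned long long)sample_rate * silence_ms + 500) / 1000);
    size_t total_frames = silence_frames + src_frames;

    /* calloc zero-fills, and zero is silence for signed PCM. */
    int16_t *dst = calloc(total_frames * channels, sizeof *dst);
    if (dst == NULL)
        return NULL;

    if (src_frames > 0)
        memcpy(dst + silence_frames * channels, src,
               src_frames * channels * sizeof *dst);

    *out_frames = total_frames;
    return dst;
}
```

Because the silence is measured in whole PCM frames, the result is accurate to one sample period, which is what the packet-level approach cannot guarantee.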

Original comment by raynebc on 4 Nov 2010 at 11:01

GoogleCodeExporter commented 9 years ago
The inaccuracy is not really an issue since it is close enough to achieve the 
desired result. I don't think anyone would complain about a few extra 
milliseconds of silence. EOF detects the actual amount added because it 
compares the length of the original OGG with the silence-added OGG.

If r525 doesn't have any issues then the only thing that's left is re-encoding 
from MP3 support. I think the best way to handle this would be to store the 
original MP3 in the chart's folder as something like "original.mp3" and check 
if that file is there when using the re-encode option. If it's there we can 
decode it to WAV, insert the desired amount of silence, and encode that.

Original comment by xander4j...@yahoo.com on 5 Nov 2010 at 5:30

GoogleCodeExporter commented 9 years ago
That sounds good to me.  I'm part way through changes to the Leading Silence 
menu to allow the user to opt for re-encoding, and should have it committed in 
the next few hours.

Original comment by raynebc on 5 Nov 2010 at 5:34

GoogleCodeExporter commented 9 years ago
r526 alters the dialog menu to offer this functionality, but 
eof_add_silence_recode() doesn't seem to be working yet.  I did find that it 
breaks something, preventing the waveform graph from being created as well.  It 
probably leaves a file path at an invalid value or something.

Original comment by raynebc on 5 Nov 2010 at 11:29

GoogleCodeExporter commented 9 years ago
I added a function to handle re-encoding with leading silence from the original 
MP3. It is currently broken due to what appears to be a bug in save_wav(). We 
still need to copy the original MP3 to the new chart's folder at chart creation 
time.

Original comment by xander4j...@yahoo.com on 6 Nov 2010 at 1:37

GoogleCodeExporter commented 9 years ago
I did a binary file comparison of the WAV file created by Lame (decode.wav) and 
the WAV file created by save_wav() (TEST.WAV).  Here were the results:

Comparing files decode.wav and TEST.WAV
0000001C: 10 BC
0000001D: B1 1A
0000001E: 02 1A
0000001F: 00 02

According to information I found about the WAV format, offset 0x1C (byte 28) is 
where the ByteRate is stored.  So it seems that only this 4 byte value is being 
incorrectly written by save_wav().
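
In the canonical PCM WAV header, ByteRate is SampleRate * NumChannels * BitsPerSample / 8, stored little-endian. The bytes from decode.wav (10 B1 02 00) read as 0x0002B110 = 176400, which is what 44100 Hz, 2 channels and 16 bits would give; that format is inferred from the value, not stated anywhere in this thread. A sketch of how the field could be computed and written (helper names are hypothetical, this is not save_wav() itself):

```c
#include <stdint.h>

/* ByteRate field of a PCM WAV header. */
uint32_t wav_byte_rate(uint32_t sample_rate, uint16_t channels,
                       uint16_t bits_per_sample)
{
    return sample_rate * channels * (bits_per_sample / 8u);
}

/* Store a 32-bit value in little-endian byte order, as WAV requires. */
void put_le32(uint8_t *dst, uint32_t value)
{
    dst[0] = (uint8_t)(value & 0xFF);
    dst[1] = (uint8_t)((value >> 8) & 0xFF);
    dst[2] = (uint8_t)((value >> 16) & 0xFF);
    dst[3] = (uint8_t)((value >> 24) & 0xFF);
}
```

Writing wav_byte_rate(44100, 2, 16) with put_le32() at offset 0x1C reproduces the four bytes found in decode.wav.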

Original comment by raynebc on 6 Nov 2010 at 6:10

GoogleCodeExporter commented 9 years ago
We were both working on this simultaneously, but I think I fixed the rest of 
the bugs.  Now we should probably just look into the OGG encoding created by 
this method, as the encoded file is much smaller than the file created in the 
new chart wizard.

Original comment by raynebc on 6 Nov 2010 at 6:59

GoogleCodeExporter commented 9 years ago
The difference is because only the first half of the audio was being copied 
into the combined audio. r535 fixes this.

Original comment by xander4j...@yahoo.com on 6 Nov 2010 at 7:19

GoogleCodeExporter commented 9 years ago
I'm going to work on getting the original MP3 copied to the new chart folder at 
creation time.

Original comment by xander4j...@yahoo.com on 6 Nov 2010 at 7:21

GoogleCodeExporter commented 9 years ago
What do you think about removing oggCat support? I am leaning toward removing 
it because the re-encode options work well enough. I don't really trust oggCat 
to do what it is advertised to do considering it always gives warnings and 
sometimes produces corrupted audio.

I haven't looked at oggCat-produced files under a microscope but I wonder if 
the audio is subtly corrupted even if it sounds like the original during casual 
listening. With the re-encode options we don't have to worry about possibly 
corrupted files and not having that extra option will make it more 
user-friendly.

Original comment by xander4j...@yahoo.com on 6 Nov 2010 at 7:42

GoogleCodeExporter commented 9 years ago
According to the author of oggCat, much of the warning-like output we'd seen 
is supposedly just notices, and he was going to remove that in a future 
release.  I think it's pretty safe to leave in as an option, since EOF will 
automatically detect whether it is present and allow the user to use something 
else.  We can always remove it later, or make re-encode the default or something.

Original comment by raynebc on 6 Nov 2010 at 8:09

GoogleCodeExporter commented 9 years ago
One thing I've run into a few times is that if I don't undo a leading silence 
operation before exiting EOF (discarding changes), the original audio is not 
restored.  So when I re-open the chart, the chart is de-synced and I have to 
manually delete guitar.ogg and rename the backup file.  EOF should probably 
handle this during the "save changes before quitting?" and "you have unsaved 
changes" logic so that if changes are discarded, the altered OGG is deleted and 
the backup is copied to replace it.

Original comment by raynebc on 6 Nov 2010 at 8:26

GoogleCodeExporter commented 9 years ago
One other thing that had been requested was the ability to add a negative MIDI 
delay to a chart.  Since EOF can now add leading silence, adding a negative 
delay can be performed by inserting abs(negative delay) milliseconds of silence 
to the beginning of the chart.

Original comment by raynebc on 7 Nov 2010 at 8:38

GoogleCodeExporter commented 9 years ago
For the first of the remaining items for this enhancement, I'm not sure if 
there would be any method easier than storing a copy of the OGG that was active 
during the last save operation, saved as something like (oggname).lastsaved.  
When EOF discards changes, it can use the file comparison utility to determine 
if the currently loaded OGG matches the OGG that was loaded during the last 
save operation.  If it doesn't match, the last saved audio could be copied over 
the current loaded audio.

Original comment by raynebc on 8 Nov 2010 at 1:12

GoogleCodeExporter commented 9 years ago
Instead of doing comparisons we could just have an oggchanged variable that is 
cleared at save. If a leading silence operation is performed set oggchanged to 
1. When discarding changes we can just check this variable and copy the last 
saved OGG back over the current OGG if it is set.

We should only create a .lastsaved file if, at save time, a backup exists and 
the current OGG is a different length than the backup. This way we don't do any 
unnecessary saving. It might be a good idea to remove the backup if the user 
resets the leading silence to 0. If the user sets leading silence and doesn't 
save a .lastsaved won't exist so we need to copy the .backup if the .lastsaved 
doesn't exist at discard time and the oggchanged variable is set.
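
The decision logic described here can be sketched as two small functions. The names, the enum and the boolean parameters are hypothetical; the file checks and the actual copying are left out:

```c
#include <stdbool.h>

/* Which file, if any, should replace the current OGG on discard. */
typedef enum {
    RESTORE_NONE,       /* audio untouched since the last save */
    RESTORE_LASTSAVED,  /* copy (oggname).lastsaved over the current OGG */
    RESTORE_BACKUP      /* no .lastsaved exists, fall back to the backup */
} restore_source;

/* Called when the user discards changes. */
restore_source choose_restore_source(bool ogg_changed, bool lastsaved_exists,
                                     bool backup_exists)
{
    if (!ogg_changed)
        return RESTORE_NONE;
    if (lastsaved_exists)
        return RESTORE_LASTSAVED;
    if (backup_exists)
        return RESTORE_BACKUP;
    return RESTORE_NONE;
}

/* Called at save time: only write a .lastsaved file when a backup
   exists and the current OGG differs in length from it. */
bool should_write_lastsaved(bool backup_exists, long current_length_ms,
                            long backup_length_ms)
{
    return backup_exists && current_length_ms != backup_length_ms;
}
```

In this sketch the caller would clear the oggchanged flag after a successful save and set it after each leading silence operation.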

Original comment by xander4j...@yahoo.com on 8 Nov 2010 at 5:25

GoogleCodeExporter commented 9 years ago
That would probably work, but oggchanged would probably have to reset to 0 if 
the user performs "load OGG".  Everything else sounds pretty good.

I was thinking about the negative MIDI offset feature, and it would involve 
many things:
1. Having eof_menu_song_properties() check for a negative value and launch the 
Leading Silence dialog with eof_etext already populated with the amount of 
silence to add
2. Have the user confirm the operation by clicking OK
3. Modifying the audio
4. Offsetting all beats, notes, phrases, etc. by the appropriate amount

Let me know if you think this feature is worth adding or if it's too messy to 
bother with.

Original comment by raynebc on 8 Nov 2010 at 5:41

GoogleCodeExporter commented 9 years ago
I would rather not support negative delay. A user who wants a negative delay 
really just wants to add leading silence, so they should just use that.

Original comment by xander4j...@yahoo.com on 8 Nov 2010 at 6:20