About future development.

chaserhkj commented 11 years ago

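Um... I know this is quite improper: an issue tracker is for tracking issues, not for forum-style chatting... But I don't have posting permission on the iVocaloid forum, and I was afraid an email sent directly to the developer would be caught by a spam filter... so I'm posting here.

First of all, I think I can count as half a programmer. I've been learning to program for about two years. I know C/C++/Python/Go/JavaScript and am fairly familiar with Linux and the various open-source software ecosystems... so much for the self-introduction.

Then... Sleepwalking-san, I think this project of yours is great!!! I have long wanted to tune Hatsune Miku to sing Chinese songs, but I couldn't do anything because I know nothing about phonetics, and I have no idea how to do reverse engineering, so I couldn't get anywhere with Vocaloid either... I have also been busy, with no large blocks of free time... So now I would like to take part in this project. I saw on the forum that you said you are weak at GUI work and C++... I happen to be a bit stronger in that area and can help with front-end development... Of course, I don't think I can do the back end at all (lol).

For now I have these suggestions:

1. I suggest sticking with C++ for development... C++ has good encapsulation, the language is relatively intuitive, and it is convenient for front-end development... Both development efficiency and runtime efficiency are relatively high... Actually, I think doing the front end in Python would be the most convenient, but for cross-platform reasons, deploying a Python runtime on Windows is a bit of a pain...

2. I suggest not using wxWidgets for the GUI; switch to Qt... Qt is much easier to learn and understand than wxWidgets... You said you were scared off by MFC's Hello World when learning C++... In fact, wxWidgets and MFC have the same style... and the inhuman complexity of the MFC API is known to everyone... Compared with wxWidgets, Qt is much easier to learn... And Qt is cross-platform too... Qt is also the GUI toolkit I know best.

3. About the open-source license... I think I should remind you that GPLv3 permits commercial use... The GPL only forbids companies from taking the code and turning it into closed-source software... If a company modifies the code and keeps it open, or even sells it, that does not violate the GPL as long as the source code is provided... But the GPL permits redistribution, which means that when a company sells GPL software, a user who buys it can legally copy it for others or share it online... In other words, the GPL does not forbid commercial use... it just takes away much of the point of commercial use... On the other hand, forbidding commercial use is not something the open-source spirit promotes... If you really want to forbid commercial use, please don't use the GPL... consider a CC license...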

<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year>  <name of author>

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.

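Um... that's basically it... I look forward to working together (though I will certainly only have time to write code in the summer holiday, after final exams (:3L))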

Project-DeltA commented 11 years ago

Am I the first one to come over to GitHub? Anyway, let me show my support first.

Sleepwalking commented 11 years ago

chaserhkj, thank you very much for your attention and support! The Rocaloid editor happens to need people right now.

We have already decided to rewrite Rocaloid in C++.
Qt seems to have some compatibility problems under GNOME, while wxWidgets behaves well across platforms.
Thanks for the reminder. But I can no longer bring myself to commercialize Rocaloid...
OK, I'll get that done right away...

I'm looking forward to cooperating in the summer holidays. Here is our QQ group for development: 255842311

[I feel it's better to use English on a foreign website... and I need to practice my English more...]

m13253 commented 11 years ago

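I think the future of Rocaloid is not only about extracting Hatsune Miku's voice bank; I hope it can support user-made voice banks. I also hope it will eventually support more than just Chinese (which will take some effort in the design of the overall program framework).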

chaserhkj commented 11 years ago

@Sleepwalking On the GUI toolkit question, I still lean toward Qt. Here are my reasons:

1. As far as I know, Qt has native support for GTK styles. Setting aside Qt 5 applications, whose own style system is not yet settled, Qt 4 applications are compatible with most desktop environments on Linux... As for the GNOME compatibility problem you mentioned, it seems to have been solved since the Qt 4.3 release...

2. Qt is more than a GUI library. It is a whole application framework providing low-level data structures, file I/O, network access, multimedia processing, HTML/XML parsing, SQL database support, a scripting framework, a plugin framework and, of course, the graphical user interface. For Rocaloid, the parts to note are multimedia and plugins. First, multimedia processing will be an important component of Rocaloid. I'm not talking about voice synthesis, which we should implement ourselves; I mean decoding audio formats (for BGM) and audio playback... Currently, with VB, much of this can be done by calling Windows system libraries. But once we move to C++ and have to think about cross-platform support, we will have to choose a cross-platform audio library... On the other hand, for the long-term development of the application we should provide a plugin system, which is another troublesome cross-platform problem because system APIs differ across platforms. Qt provides a cross-platform plugin framework that solves this directly.

3. Qt has the most thorough documentation of any GUI toolkit I know, which makes developing a Qt application much less painful than with many other toolkits.

In conclusion, please consider carefully whether to use Qt. Of course, this is only a suggestion based on my personal opinion; if you insist on wxWidgets... I can still write the code, and I will respect the project leaders' decision = w =

BTW, I have joined the QQ group, but I hardly use QQ, so please send important messages via GitHub, email or GTalk... My email address is on my GitHub profile page.

Best wishes, Chaserhkj

m13253 commented 11 years ago

Sorry to interrupt, but you may be interested in Pascal. Switching from VB to Pascal can be relatively easy. You could try the Lazarus IDE, an open-source clone of Delphi. It is not only an IDE; it also ships with a complete library called the LCL (Lazarus Component Library), and creating controls is as easy as in VB. The slogan of Lazarus is "Write Once, Compile Everywhere", which shows its portability. However, considering the long-term development of Rocaloid, I still suggest Qt, which is much more mature. I strongly advise against wxWidgets, because it only covers the GUI; audio playback, configuration file handling and database access would have to be implemented separately. Anyhow, I cannot stop you if you insist on wxWidgets. I do not use QQ. Please email me if there is something important. My email address is on my GitHub profile page.

Best regards, Star Brilliant

rgwan commented 11 years ago

For outputting the preview audio stream directly, calling PulseAudio may be better than using another library. But I hadn't thought about audio encoding and decoding; that was my oversight. I have used quite a few programs built on wxWidgets and written some myself, so I think wxWidgets is easy to use, though of course it has some awful features. Having heard everyone's suggestions, I think we need to think this over some more.

FrankHB commented 11 years ago

Strongly against tight coupling with any GUI framework. If needed, just trust others to do the GUI stuff.

m13253 commented 11 years ago

FrankHB wrote:

Strongly against tight coupling with any GUI framework. If needed, just trust others to do the GUI stuff.

I support this. The core engine must be separated from the GUI front-end. The typical model is that the GUI front-end outputs an intermediate file as the input of the CLI engine, and everything happens automatically. This enables others to develop a different GUI for your engine.

Sleepwalking commented 11 years ago

Good idea. I agree. So we should focus on the engine and the parameter generator and let someone else open another repository for the GUI. And I think a CLI-based engine would result in a command-line window flashing after you press 'synthesis'... a bad experience. So we'd better provide both a CLI-based engine and a dynamic link library version.

m13253 commented 11 years ago

On 2013-6-14，16:09，"Sleepwalking" wrote：

And I think a CLI-based engine would result in a command-line window flashing after you press 'synthesis'... a bad experience.

Didn't you know that in Linux, a terminal window will never show up unless you ask to?

Didn't you know that in Windows, there is a SW_HIDE option for CreateProcessW and WinExec?
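A rough Python sketch of the point above, assuming the front-end launches the engine as a child process. `run_engine_hidden` is a hypothetical helper name and the child command is a stand-in, not Rocaloid's actual CLI. On Windows it sets the show-window field of `STARTUPINFO` to `SW_HIDE`; on other systems a child process does not open a terminal by itself.

```python
import subprocess
import sys


def run_engine_hidden(cmd):
    """Run a console program and capture its output without showing a window."""
    kwargs = {}
    if sys.platform == "win32":
        # Ask Windows to start the child with its window hidden (SW_HIDE).
        startupinfo = subprocess.STARTUPINFO()
        startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
        startupinfo.wShowWindow = subprocess.SW_HIDE
        kwargs["startupinfo"] = startupinfo
    # On Linux and macOS a child process never opens a terminal by itself.
    return subprocess.run(cmd, capture_output=True, text=True, check=True, **kwargs)


# Stand-in for the real engine binary: a child Python process that prints a line.
result = run_engine_hidden([sys.executable, "-c", "print('rendered')"])
print(result.stdout.strip())
```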

m13253 commented 11 years ago

On 2013-6-14，16:09，"Sleepwalking" wrote:

So we'd better provide both a CLI-based engine and a dynamic link library version.

However, you should be aware that if your engine crashes, the external-process model will protect your editor and your work, whereas the dynamic-library model will leave you with nothing.

Let us take VOCALOID and UTAU as examples.

The core engine of VOCALOID is in a DLL. I often experience freezes if I write overly complex phonetic symbols, and I cannot even stop the playback! (Hitting the stop button does not stop the render until the note currently being rendered is finished.)

In UTAU, by contrast, if I make a note too short in renzoku mode, the render process crashes as well. But the editor is safe! And the most exciting thing is that I can even render in batch, thanks to the command-line interface!

As for the console or terminal window you just mentioned, have you noticed that when you hit Compile, your IDE is actually executing gcc, cc1, as and ld in order in the background (or CL, LINK etc. if you use Microsoft Visual Studio)? Most users are probably not even aware of this!

What about media converter software like avidemux? It calls ffmpeg (or loads only ffmpeg's library, which is called libavformat.so / libavformat.dll if I remember correctly), and the user never notices. The opposite is Ulead VideoStudio, which often freezes when I encode 1080p video, simply because it renders in the same process as the editor.
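To illustrate the crash-isolation argument, here is a small Python sketch under stated assumptions: the "renderer" is a stand-in child process that aborts on purpose, and `render` and `project` are hypothetical names. The editor only sees a non-zero exit code and keeps its data.

```python
import subprocess
import sys

# Stand-in for a renderer that crashes hard (os.abort() kills the process).
CRASHING_RENDERER = [sys.executable, "-c", "import os; os.abort()"]


def render(cmd):
    """Run the renderer as a child process and report whether it succeeded."""
    completed = subprocess.run(cmd, capture_output=True)
    return completed.returncode == 0


project = {"notes": ["C4", "D4", "E4"]}  # the editor's unsaved work

ok = render(CRASHING_RENDERER)
if not ok:
    print("renderer crashed; the editor keeps running")

# The editor process and its in-memory project are untouched by the crash.
print(len(project["notes"]))
```

With the engine loaded as a library in the same process, the same abort would have terminated the editor as well.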

chaserhkj commented 11 years ago

@Sleepwalking The flashing command window issue is definitely due to calling the external program incorrectly, or to an incorrect flag passed to the compiler when building...

Invoking system calls from the front-end program can surely do the job smoothly.

m13253 commented 11 years ago

@chaserhkj wrote:

The flashing command window issue is definitely due to calling the external program incorrectly, or to an incorrect flag passed to the compiler when building... Invoking system calls from the front-end program can surely do the job smoothly.

As you see, @Sleepwalking has his mind stuck on Windows.

Quoting the comment I sent a few minutes ago:

Didn't you know that in Linux, a terminal window will never show up unless you ask to? Didn't you know that in Windows, there is a SW_HIDE option for CreateProcessW and WinExec?

Sleepwalking commented 11 years ago

OK... You win... I admit my mind was stuck on Windows... I'm learning Linux these days...

Is there any way to share memory between different processes? I'm also worried about frequent disk I/O, like what UTAU does.

m13253 commented 11 years ago

@Sleepwalking wrote,

Is there any way to share memory between different processes? I'm also worried about frequent disk I/O, like what UTAU does.

If you have tried X11 (currently a must-have component of a Linux desktop), you will know how to do it. X11 uses both sockets/pipes and shared memory. I don't recommend shared memory, since it is implemented differently on Windows and on Linux, even though it is much faster than pipes and sockets. On Windows, shared memory is done with file mappings (CreateFileMapping and MapViewOfFile); WM_COPYDATA is a separate, message-based way of passing data between windows.

I recommend using a pipe. What is a pipe? When you execute dir | more in the command prompt, you are using a pipe. On Linux we have other pagers besides more, such as less, w3m, most, or even the text editor vim (with the argument -). No matter which pager you choose, the program dir or ls does not behave differently. So the advantage of a pipe is that if one side is swapped for another compatible program, the two will still work together. This applies to Rocaloid, because someone else may write a new graphical front-end to replace the current one.

Another reason I discourage shared memory is that if you write another GUI front-end to work with it, unexpected results may appear. Use shared memory only if you are copying gigabytes of data.

On Windows you can use CreatePipe, and on Linux you can use pipe, to create a pipe.

Do you know why UTAU does so much disk I/O? It saves temporary WAV files on disk, one note per file. I believe that is unnecessary. You can do better, without so much disk I/O.
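As a sketch of what pipe-based communication could look like (an assumption for illustration, not Rocaloid's actual protocol), the front-end below sends note names to a stand-in renderer through its stdin and reads the result back from its stdout, with no temporary files. `FAKE_RENDERER` and `render_through_pipes` are hypothetical names; a real renderer would emit PCM samples rather than one byte per note.

```python
import subprocess
import sys

# Stand-in renderer: reads note names from stdin, writes one byte per note to stdout.
# A real renderer would write PCM samples instead.
FAKE_RENDERER = (
    "import sys\n"
    "notes = sys.stdin.read().split()\n"
    "sys.stdout.buffer.write(bytes(len(n) for n in notes))\n"
)


def render_through_pipes(notes):
    """Send notes to the renderer's stdin and read the result from its stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", FAKE_RENDERER],
        input=" ".join(notes).encode("ascii"),
        stdout=subprocess.PIPE,
        check=True,
    )
    return completed.stdout


audio = render_through_pipes(["C4", "D#4", "E4"])
print(list(audio))
```

Because the two sides only agree on what flows through the pipe, either one can be replaced by a compatible program.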

m13253 commented 11 years ago

I recommend the model of Kdenlive and Melt. Kdenlive is a video sequence editor. Melt is a video renderer and it can call ffmpeg to encode the video.

When you finish a project and hit Render in Kdenlive, it exports a short script that Melt can understand to a temporary folder. Then Kdenlive calls Melt. Melt keeps printing the current percentage, and Kdenlive uses a pipe to read it and show it to the user. Melt also calls ffmpeg to encode the video to a format you will probably know, such as mp4 or avi. The advantage is that if Melt someday switches to GStreamer for encoding, Kdenlive will not behave differently. This is unachievable with a .dll or .so. Every program does its own job and does it well. That is the spirit of UNIX and Linux. You may choose not to use Linux in daily life, but you should not reject its spirit. It will help you a lot.
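The progress-reporting part of that model can be sketched in Python as follows. This is only an illustration: the child process is a stand-in that prints percentages, not Melt itself, and `render_with_progress` is a hypothetical name.

```python
import subprocess
import sys

# Stand-in for a renderer such as Melt: prints one percentage per line, flushing each time.
FAKE_RENDERER = (
    "import sys\n"
    "for pct in (0, 25, 50, 75, 100):\n"
    "    print(pct, flush=True)\n"
)


def render_with_progress(on_progress):
    """Start the renderer and forward each progress line to a callback."""
    with subprocess.Popen(
        [sys.executable, "-c", FAKE_RENDERER],
        stdout=subprocess.PIPE,
        text=True,
    ) as proc:
        for line in proc.stdout:
            on_progress(int(line))
    return proc.returncode


seen = []
exit_code = render_with_progress(seen.append)
print(seen, exit_code)
```

In a GUI, the callback would update a progress bar instead of appending to a list.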

FrankHB commented 11 years ago

I suggest exposing the core engine API and building both the CLI and the GUI as front-ends on top of this API layer. There are two separate points.

1. How to make multiple components work together. This is an architecture design problem. The classic pipe-and-filter architecture might fit here. But note that a multi-process architecture is harmful to reusability and user experience if not carefully designed. It may take a lot of time to get right, and it is not friendly to rapid prototyping. A DLL is fine for eliminating data communication paths, even within a multi-process system; at the very least, statically loaded modules are much simpler than IPC. If you want the intermediate data to be processed by third-party programs, it is surely better to make the format publicly available and use multiple processes. That's true for the GNU toolchain, but I'm not sure it should be the case here just yet. Otherwise, you can design a private data transfer scheme between processes; this is still somewhat complex, even for plain text (parsing is needed). It seems relatively easy to split the system into multiple processes in the future. Protecting users' work on failure is a plus, but failure should not be treated as the common case, and the protection can be achieved through other means, e.g. automatic backup.
2. How to interact with end users (i.e. the UI). A CLI is a UI; it is not a means of solving the problem above. If end users are not expected to care much about command lines, a CLI is not necessary, though I think we can have both a CLI and a GUI eventually.

m13253 commented 11 years ago

Reply @FrankHB

You are not thinking of the real situation of Rocaloid. Rocaloid has two parts: the editor and the renderer.

The editor should just edit the project files. When editing is finished, the user can either press the 'Export to WAV...' button to run the renderer automatically in the background, or close the editor and open a terminal to call the renderer (especially useful for batch rendering).

Although the first way is more common, I think that the editor should be separated from the renderer.

The editor has nothing to do with the renderer. If the user asks to preview a small segment of the whole synth project, the editor can just execute the renderer to render the required seconds or minutes.

They do not have to communicate too much.

If we implement it as dynamic libraries, we will be unable to call the renderer individually.

chaserhkj commented 11 years ago

@m13253 It seems you are trying to apply some Unix philosophy to our project, and that's fine. I fully agree with your recommendations.

On the other hand, I also agree with @Sleepwalking's recommendation to build shared libraries, out of consideration for the extensibility of our engine. Providing a shared library makes it possible to embed our engine in other applications, which would surely be great.

So, I think that our application may have three parts:

A shared library as the backend engine.
A CLI frontend linking to the shared library.
A GUI calling the CLI frontend as backend.

This is just like Kdenlive, which has a Qt-based GUI front-end that calls the ffmpeg CLI tool to render, and ffmpeg is itself a CLI front-end to the libavutil and libavcodec libraries...

This also leaves room for other GUI developers who prefer to communicate between front-end and back-end through C/C++ library APIs rather than pipes or the like.

On the other hand, you decided not to depend on Qt for the back-end, but I still strongly recommend implementing a plugin framework so the rendering engine can be extended with other features. We can either rely on system calls to implement this ourselves, or depend on other open-source, cross-platform frameworks.
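The three-part layout proposed in this comment could be sketched roughly like this in Python. All names here (`synthesize`, `cli_main`, the `render` program name and the `--rate` flag) are hypothetical and chosen only for illustration; the "engine" returns a description instead of real audio.

```python
import argparse


# Layer 1: the "shared library". A pure function with no UI code.
def synthesize(notes, sample_rate=44100):
    """Hypothetical engine entry point; returns a description instead of real audio."""
    return {"sample_rate": sample_rate, "note_count": len(notes)}


# Layer 2: the CLI front-end, a thin wrapper that only parses arguments.
def cli_main(argv):
    parser = argparse.ArgumentParser(prog="render")
    parser.add_argument("notes", nargs="+")
    parser.add_argument("--rate", type=int, default=44100)
    args = parser.parse_args(argv)
    result = synthesize(args.notes, sample_rate=args.rate)
    print(f"{result['note_count']} notes at {result['sample_rate']} Hz")
    return 0


# Layer 3: a GUI would either spawn the CLI as a child process or import the
# library directly; here the CLI entry point is called in-process for brevity.
exit_code = cli_main(["C4", "D4", "--rate", "22050"])
```

Keeping the CLI this thin means a GUI that links the library and a GUI that spawns the CLI both end up in the same engine code.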

m13253 commented 11 years ago

I agree with @chaserhkj. Maybe I expressed my idea inaccurately. What I have in mind for the application is a lib + CLI + GUI model.

I don't think we should depend on Qt for backend. I have not been using Qt for development, but I have tried Gtk. If you use such a framework, something of the overall framework of the application would change... I can not express my idea precisely. Anyhow, I do not recommend using Qt for backend, but I strongly recommend Qt for frontend since it is highly portable.

FrankHB commented 11 years ago

@m13253 There is no contradiction between implementing the core API in (dynamic) libraries and keeping the editor separate from the renderer. I have not said there should be only one executable. What I emphasized is the call path: the back-ends do not always have to run as separate processes. The library provides an API that lets developers build programs with a UI when needed. Both the editor and the renderer seen by end users can be thin wrappers over these libraries, whose only responsibility is to provide the UI. But the editor does not call the renderer directly. When the renderer is invoked in the first way, users are free not to care whether a separate process is involved. As for the second way, if opening a terminal to call the renderer is feasible, opening a GUI application to do the same thing should be feasible too. How could others implement that conveniently without an exposed API?

Sleepwalking commented 11 years ago

Have you thought about real-time rendering? If the user has entered a very long set of notes in the editor, I guess it would take some time to process the intermediate files between the editor and the engine. A CLI-based engine would make real-time rendering difficult... You could achieve it through pipes, but that would be much harder to program than using a shared library.

m13253 commented 11 years ago

Five comments up:

If the user asks to preview a small segment of the whole synth project, the editor can just execute the renderer to render the required seconds or minutes.

However I am convinced that there should be a shared library. Thank you for your patience.

chaserhkj commented 11 years ago

@Sleepwalking I don't think @m13253's design would cause much of a problem for performance or coding... Building a system out of many small components connected by pipes and sockets is a mature approach, well supported by the APIs and features of nearly all platforms. It is just a preference, or a so-called philosophy of coding. It won't be complex.

On the other hand, this "Unix style" of building a system gives a highly flexible structure: every component can be taken out and used separately, so our work can be shared with or mixed into other projects, or given multiple front-ends. Meanwhile we could provide an "official" selection of components so that an installation works out of the box, while leaving freedom to users who would rather choose for themselves, which would surely be great.

Sleepwalking commented 11 years ago

@chaserhkj Well, I have nothing more to add on this topic, since I haven't used pipes or sockets...

We should quickly reach an agreement. I have too many things to do this summer...

m13253 commented 11 years ago

I agree: a lib + a CLI + a GUI. Just as @chaserhkj said seven comments up.

digited commented 11 years ago

Hi Sleepwalking and other developers,

I'm making a Qt4 GUI for resamplers: QTau, a UTAU-like editor that can use any resampler. Coding was started by Tobias Platen here: https://gitorious.org/lauloid The author is making a resampler based on libsms (Spectral Modelling Synthesis, from Pompeu Fabra University in Barcelona), and another one based on "World" (Sekai).

So if you need a UTAU-like GUI for your synthesizer, I'm making one. Like all Qt applications, it supports localization to any language and is cross-platform (it works on Windows and Linux at the moment).

m13253 commented 11 years ago

@digited Thank you for your project. I am not a contributor to Rocaloid (Sleepwalking is), but I would like to help translate your QTau. I have searched but have not found where the .po or .ts files are stored. Please give me instructions on how to localize your project.

P.S. sorry I don't have a Gitorious account so I leave my message here.

Sleepwalking commented 11 years ago

@digited It's very kind of you to offer the QTau editor for Rocaloid. We've partly finished rewriting Rocaloid in C++, but we have decided to pause the rewrite and plan to develop a better synthesis engine instead. The current C++ version of Rocaloid can already synthesize (with minor bugs), and we are going to use the same script format for the new engine. So collaborating makes sense, and it doesn't matter if the engine is updated.

But I'm not sure about compatibility between QTau and Rocaloid. The conversion from notes to speech goes through two steps in Rocaloid and its synthesis engine heavily relies on parameters (and currently requires a massive database).

I hope Rocaloid can have its own GUI in the future, but it's fine to use QTau or another GUI for now, since we won't have one until the new Rocaloid is finished (which might be half a year from now).

digited commented 11 years ago

@Sleepwalking

its synthesis engine heavily relies on parameters

What parameters? What controls and settings do you need in GUI frontend?

I also plan to support interface extensions for QTau, so that synthesizers loaded as Qt plugins (cross-platform DLLs) can extend the user interface with the special UI controls they need.

QTau is fully open source, fully free and custom-built, so anything is possible. Please just describe what you need from the GUI.

Sleepwalking commented 11 years ago

@digited

What parameters?

CVS & RSC structure. CVS describes the phonetic details of syllables, such as phonemes, durations, envelope, etc. RSC is like the .vsqx or .vsq files in Vocaloid, or *.ust in Utau. Rocaloid Engine is mainly composed of two parts, the CVE synthesis engine and RSCCommon, which is a converter of RSC and CVS.

What controls and settings do you need in GUI frontend?

I guess the easiest (not the best) way is to start RSCCommon first and then start CVE. You can pass some basic parameters (lyrics and notes) to RSCCommon and let it do the conversion for you. But this limits your access to CVS, which is crucial for producing a better song. You may embed RSCCommon in your editor, or modify the *.cvs file it produces, though that requires much more work. The RSC and CVS formats are easy to learn, and they are stored as plain text, so they are easy to read and write. I've described the file formats here: http://bbs.ivocaloid.com/thread-115484-1-2.html http://bbs.ivocaloid.com/thread-115503-1-1.html

digited commented 11 years ago

I can't read Chinese, sorry.

How can I build and test your synthesizer?

Sleepwalking commented 11 years ago

@digited Sorry for Chinese...

The newest code is at rgwan's fork: https://github.com/rgwan/Rocaloid The current version is developed on Ubuntu with Anjuta, and the repository is actually an Anjuta project (I used Anjuta 3.9.1). It hasn't been tested on Windows yet. rgwan said he had tried to compile it with MinGW: compilation succeeded, but a runtime error reported a missing shared library. I guess static linking could solve that. By the way, the original .NET-based version is in my Rocaloid 1.6.0 branch. It has the same functionality as the C++ version; the only difference is speed (the C++ version is 6 times faster...).

Feel free to ask us any questions about Rocaloid.

@rgwan The command line arguments are your part, please describe them.

Sleepwalking commented 11 years ago

Wait a moment... Some code is not merged yet. I'm dealing with it now.

Sleepwalking commented 11 years ago

@digited Everything is ready. I've merged all the features into my repository. In addition, the "minor bugs" I mentioned have been fixed. The CLI for CVE (cvecli) is finished, but the one for RSCCommon (rsctool) is not. There is a simple snippet for loading, converting and storing those file formats in /RocaloidEngine/src/main.cc

You need a sound database and a dictionary to run Rocaloid; both were bundled with the .NET version: http://pan.baidu.com/share/link?shareid=540236&uk=3423845838

Here are some cvs & rsc files for test: http://pan.baidu.com/share/link?shareid=3408246916&uk=3423845838

digited commented 11 years ago

15 kb/s (with my 100 mbit/s connection), and it was cancelled near the end, retrying. Downloading will take some time.

digited commented 11 years ago

The download was cancelled for the 4th time, and resuming isn't supported. Can you please upload those big files to a service outside the Chinese network segment?

Sleepwalking commented 11 years ago

It's a little bit hard... We're in the same situation... I'll try. Uploading to RapidShare, at a relatively slow but steady speed... Is that OK?

Sleepwalking commented 11 years ago

http://rapidshare.com/files/968788023/RocaloidCoreVer1.6.7z

digited commented 11 years ago

@Sleepwalking Downloaded in a flash, thanks. Will try soon.

m13253 commented 11 years ago

Since you have a .NET version, I just wonder: why don't you try porting it to Mono?

Sleepwalking commented 11 years ago

Most of the contributors to Rocaloid use C/C++.
.NET cannot reach the maximum efficiency, especially when running under Mono.
I want to learn C/C++ and Linux...

digited commented 11 years ago

QTau looks like this now:

[screenshot: qtau_progress]

Needs some more work before publishing. Hope to finish proper gui demo this week. QTau is licensed under WTFPL (http://www.wtfpl.net/about/), so you can use it for any purpose and in any way you like (I'll keep working on it of course).

Update: the license has been changed to BSD to avoid offending some fragile souls (that doesn't affect Rocaloid in any way).

Update 2: obviously no demo "this week"; things are changing as I go. It's great that I'm not working on it alone now.

digited commented 11 years ago

Both UTAU and Cadencii (http://vocaloid.wikia.com/wiki/Cadencii) use resamplers as external processes, which requires the following on every launch of the synthesizing process:

1. the resampler reads and parses oto.ini (the voicebank description)
2. the resampler checks for all the .wav files of the voicebank and loads them
3. the resampler writes the output .wav somewhere
4. the editor reads the output .wav

While there's no reason not to add the same functionality to QTau, I'd prefer to use Qt plugins: cross-platform dynamic libraries with extended functionality.

Qt plugins may supply GUI widgets and extend the editor's interface (main menu, toolbars, and generally anything).
Since they use Qt and share the same container classes and structures, a plugin may receive an already parsed oto.ini (or any voicebank data ready to be used).
I'd prefer to use a special audio cache object to cache the voicebank's .wav files efficiently, so that each is read from disk only once, as long as the cache doesn't exceed the maximum RAM size set in the editor settings.
Synthesizing (resampling) is launched in a separate thread to avoid freezing the editor's user interface, and inside a try {} block to avoid a crash on resampler errors, if they happen.
Because it's a dynamic library, the resulting synthesized/resampled PCM comes back in RAM, without being written to disk and then read by the editor.

Optimizing disk i/o and extensibility can give QTau an edge over both UTAU and Cadencii, I hope. (besides UTAU being stalled since 2011 and Cadencii requiring .NET/mono or Java, that is)
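Two of the ideas above, the size-limited audio cache and the guarded worker thread, can be sketched in Python. This is not QTau's code: `AudioCache`, `render_in_background` and `fake_load` are hypothetical names, and the loader returns dummy bytes instead of reading .wav files.

```python
import threading
from collections import OrderedDict


class AudioCache:
    """Keeps recently used sample data in RAM, evicting the oldest past a byte limit."""

    def __init__(self, max_bytes, loader):
        self.max_bytes = max_bytes
        self.loader = loader        # function: path -> bytes (would read the disk)
        self.items = OrderedDict()  # path -> bytes, least recently used first
        self.size = 0

    def get(self, path):
        if path in self.items:
            self.items.move_to_end(path)  # mark as most recently used
            return self.items[path]
        data = self.loader(path)
        self.items[path] = data
        self.size += len(data)
        while self.size > self.max_bytes and len(self.items) > 1:
            _, evicted = self.items.popitem(last=False)
            self.size -= len(evicted)
        return data


def render_in_background(resample, on_done):
    """Run the resampler in a worker thread; report a result or the error."""
    def worker():
        try:
            on_done(resample(), None)
        except Exception as exc:  # an exception must not take the editor down
            on_done(None, exc)
    thread = threading.Thread(target=worker)
    thread.start()
    return thread


disk_reads = []


def fake_load(path):
    """Stand-in for reading a .wav file: records the access, returns 4 dummy bytes."""
    disk_reads.append(path)
    return b"\0" * 4


cache = AudioCache(max_bytes=8, loader=fake_load)
for path in ["a.wav", "a.wav", "b.wav", "c.wav"]:
    cache.get(path)

results = []
render_in_background(lambda: 1 / 0, lambda pcm, err: results.append((pcm, err))).join()
```

Note that a try block only catches exceptions; a hard crash inside native code loaded into the same process would still bring the editor down, which is the trade-off discussed earlier in the thread.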

Qt plugins are compiled with g++, so plugin may wrap any code in C/C++, it can be just a special manager class to your already existing C/C++ synthesizer (even if you use STL, Boost and other non-Qt utils).

http://qt-project.org/doc/qt-4.8/plugins-howto.html

digited commented 11 years ago

Source code for QTau is here, should work for Windows and Linux.

Sleepwalking commented 11 years ago

All right. I'm going to check it out.

tuxzz commented 11 years ago

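Maybe I can help with the front end too. My idea is to use Python + pygame (SDL); I have built a fairly complete GUI library with pygame. I suggest wrapping the engine in a DLL and then exposing some high-level and low-level interfaces for Python to call. That's about it. BTW, I'm weak at English and math.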

tuxzz commented 11 years ago

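Oh, by the way, regarding voice bank creation, my idea is to generate pronunciation through a dictionary between the International Phonetic Alphabet (IPA) and local scripts (kana, romaji, Chinese characters, pinyin and so on). Later, when things are more complete, I can provide a voice bank covering a young boy's and a middle-aged man's voice for testing. I have high hopes for this project!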

m13253 commented 11 years ago

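Sleepwalking and I are thinking of using X-SAMPA phonetic symbols. (Sleepwalking was worried at first that this would lead to too many diphone combinations, and later decided to build a compatibility layer from the local language to X-SAMPA; you would have to ask him for the details.) The advantage of X-SAMPA is that its phonetic notation is compatible with well-known software such as VOCALOID and mbrola.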
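A compatibility layer of this kind could, in its simplest form, be a dictionary lookup. The Python sketch below is only an assumption about the shape of such a layer: `PINYIN_TO_XSAMPA` and `to_xsampa` are hypothetical names, the three entries are illustrative, and tones are ignored.

```python
# Illustrative entries only; a real dictionary would be far larger and handle tones.
PINYIN_TO_XSAMPA = {
    "ma": ["m", "a"],
    "ni": ["n", "i"],
    "la": ["l", "a"],
}


def to_xsampa(syllables, table):
    """Convert local-script syllables to a flat list of X-SAMPA phonemes."""
    phonemes = []
    for syllable in syllables:
        if syllable not in table:
            raise KeyError(f"no X-SAMPA entry for {syllable!r}")
        phonemes.extend(table[syllable])
    return phonemes


print(to_xsampa(["ni", "ma"], PINYIN_TO_XSAMPA))
```

Supporting another language would then mean supplying another table rather than changing the engine.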

m13253 commented 11 years ago

I'm also very familiar with Python. But we have decided to use Qt and C++. Besides, we already have an editor.
