Open GoogleCodeExporter opened 8 years ago
Same problem on the iPhone.
BTW, the device can only play 720p videos because the decoding is done on
dedicated hardware. Video decoding is very processor intensive, especially due
to the fact that ffmpeg is poorly optimized for embedded devices.
Do you plan on doing any performance profiling? I'm curious to see what
specific parts of the code are an issue. I have not found rendering to be /that/
big of a bottleneck; rather, the decoding of the input streams and the generally
un-streamlined interaction between threads seem to take up most of the CPU time.
Even so, I've gotten it to run at about 50% speed, and there doesn't seem to be
anything more to specifically optimize.
In the long run, it seems a fork/rewrite is in order, rearranging everything to
make it more thread-friendly and stripping out SDL and possibly libcurl
completely.
Original comment by niel...@gmail.com
on 6 Jan 2010 at 2:48
Yep yep, I know the decoding is done by dedicated hardware, I've worked on the
gst-dsp elements (a little) so I know.. :) but I also know the decoding
shouldn't be an issue because the N800 and N810 (330MHz CPUs iirc) were doing
video decoding on the CPU, not on the DSP (the DSP was used for the decoding of
audio) and it was fine.
Also, if you install mplayer on your N900 and try it out, it should work very
well, and that's using the same ffmpeg as ORP uses.
If I find some time later today, I'll try to run oprofile on that and see what
happens.
I definitely agree with the rewrite/refactoring. I was totally shocked to see
the makefile and the forced compilation of dependencies (zlib, libpng, freetype,
curl, faad2, SDL, ffmpeg, openssl, wxWidgets, etc..) although they are already
available on the platform.. and I didn't like the static linking of everything
either.. no ./configure, no Makefile.am.. it's far from a standard source
package...
Also, the UI definitely needs some work because I had to disable some fields in
the edit profile window just to be able to see the save button.. no scrollbar...
I'd suggest using gtk for the new UI, this way it wouldn't require much work to
port it to a hildonized, small screen/high dpi, finger friendly UI...
I would also suggest using gstreamer for the decoding/rendering, as it would
make it so much easier.. and using gstreamer would allow ORP to use the DSP
hardware simply by letting gstreamer choose the best h264 decoder for the
stream (gstdspvdec instead of ffdec_h264). I don't know the internal workings of
ORP, but a simple gstreamer pipeline could be built with 10 lines of code:
"appsrc ! video/x-h264,width=320,height=240 ! decodebin ! xvimagesink" and ORP
could just feed it raw data through the 'appsrc' video source.
no SDL, no ffmpeg involved, it's magic! :)
Original comment by snifikino
on 6 Jan 2010 at 4:17
Hey nielkie,
I did a little test of oprofile. I was able to get oprofile to run on the N900,
and I got the debug symbols for the kernel and for orp.
This isn't much because I'm not at home at the moment, so I couldn't connect,
so the only thing this opreport result shows is opening the UI, clicking on
launch, and waiting for about 10 seconds while it tries to connect, then I
stopped it.. it was mainly done to test whether oprofile was working correctly
or not, and also because the whole 'fade in' image when orp tries to connect
seemed slow too, so it would be nice to profile that too.
Attached is the file, have a look if you're curious!
I checked your iphone port issue, it's nice to see some optimization done, and
I'd be interested in seeing a refactoring or a fork that would use better
technologies. I tried to check your patch but it was too big and I don't have
much time, so maybe you could quickly explain it to me, saying how it improved
the performance, and tell me if it would be safe to use that or if it contains
iphone specific stuff (I saw #ifdefs though).
Original comment by snifikino
on 6 Jan 2010 at 9:48
Attachments:
Hi again, here are my oprofile results for a simple remote play session.. I
actually tried it a few times since oprofile is a statistical tool, so the more
samples we get, the better results we have...
The first one, I opened the ui, launched the game, waited quite a while, and
used left/right to change the view, and had sound played when I was over a
game's thumbnail. The second one lasted less time (it got an error about a
corrupted stream so it stopped early), and I didn't do much with it, just opened
it without sound... the third and fourth reports are also 'idle' ones but with
the cursor on the game, so we receive sound (the game's sound when highlighted
in the XMB).
The 4th report is important because when I took it, I had just installed the
libc6-dbg and libstdc++ debug packages, so we can see which calls are being made
in libc and libstdc++...
We get about 40% CPU on ORP, 35% CPU on the kernel, 8% in libc calls, then some
more CPU for pulseaudio, the FB driver, and some nokia voice driver.
I also attached the result of 'powertop', which can be interesting.
As you can see most of the CPU is being used on the H264 decoding.. there is
also some CPU needed for faad, but what worries me is all the time needed for
scheduling the threads, as well as inside libc, which is mainly memcpy and
memset calls! This means the code really needs to be optimized in order to
reuse buffers instead of copying data over and over again...
All this copying is also causing a huge amount of CPU to be used for DMA.. look
at the mcspi calls in the kernel!! The mcspi is the driver for DMA (google it),
which means that all those memcpy calls are having a huge impact on performance
because of the memcpy CPU, and the DMA...
I think that if the code gets optimized to avoid any unnecessary memory
allocations on the critical path, as well as memsets and memcpys, then we should
get much better performance. Fixing the threads to behave more nicely, and
finally getting that H264 decoding off the CPU and onto the DSP, should make ORP
run smoothly on the N900!
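To make the buffer-reuse idea above a bit more concrete, here is a rough sketch
in Python (chosen only for brevity; this is not ORP's code). It assumes a
fixed-size pool of preallocated buffers that the receive path writes into and
the decoder reads through a view, so nothing is allocated or copied per packet
after startup. `BufferPool`, `receive_into`, `decode` and `run` are made-up
names for illustration; `receive_into` stands in for something like
`socket.recv_into()`, and `decode` just computes a checksum.

```python
from collections import deque


class BufferPool:
    """Fixed set of preallocated buffers that are recycled, never reallocated."""

    def __init__(self, count, size):
        # All allocation happens here, once, off the critical path.
        self._free = deque(bytearray(size) for _ in range(count))

    def acquire(self):
        # Returns None when every buffer is in flight; the caller must then
        # wait or drop the packet instead of allocating a new buffer.
        return self._free.popleft() if self._free else None

    def release(self, buf):
        self._free.append(buf)


def receive_into(buf, payload):
    """Stand-in for socket.recv_into(): write the packet straight into buf."""
    n = len(payload)
    if n > len(buf):
        raise ValueError("packet larger than buffer")
    buf[:n] = payload  # same-length slice assignment, buf is not resized
    return n


def decode(view):
    """Stand-in for the decoder: reads the payload through a view (no copy)."""
    return sum(view) & 0xFF


def run(packets, pool):
    results = []
    for payload in packets:
        buf = pool.acquire()
        if buf is None:
            raise RuntimeError("pool exhausted")
        n = receive_into(buf, payload)
        # memoryview slicing exposes the first n bytes without copying them.
        results.append(decode(memoryview(buf)[:n]))
        pool.release(buf)
    return results


if __name__ == "__main__":
    pool = BufferPool(count=2, size=16)
    print(run([b"\x01\x02", b"\x03", b"\xff\x01"], pool))
```

In the real code the same pattern would apply to the packet and frame buffers
handed between the network, decode and render threads; the pool size and buffer
size here are arbitrary.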
I hope this is helpful, and if you guys need some more profiling done or other
kinds of information, let me know!
KaKaRoTo
Original comment by snifikino
on 7 Jan 2010 at 6:23
Attachments:
Wow! Great stuff posted here... Particularly the mention of GStreamer. I've
heard of it but have never looked at the API. It sounds like switching to this
will eliminate most of the bottlenecks that are causing grief... AND with
GStreamer's clocking support it looks like my a/v sync issues will just go
away :)
I'll create a GStreamer branch and see how that goes...
Original comment by darryl...@gmail.com
on 11 Jan 2010 at 10:36
Any progress on this?
I got ORP packaged on the official repos and would really like to see a switch
to GStreamer (which would use the DSP)
Original comment by mohammad...@gmail.com
on 5 May 2010 at 1:58
Yes, well, nothing public yet due to lack of stability. I have an experimental
branch working with GStreamer, but it's far from feature complete. When I have
more time I'll come back to it and release it.
Original comment by darryl...@gmail.com
on 5 May 2010 at 2:13
If you need help or something, let me know, I might try to use some of my free
time to have a look at your code and help you out with it. Or better yet, if you
have specific issues or GST_DEBUG logs, I can take a look and try to figure out
what's wrong.
I'm a GStreamer expert working for Collabora (the company behind GStreamer), so
feel free to ask :)
Original comment by snifikino
on 5 May 2010 at 3:06
Awesome! That's great to hear. I can already think of a few questions but I
won't bother you with them now. I should have time to continue work on the GST
branch in a few weeks. I must say I'm *very* impressed with the GST API, it's
been a joy to learn - it just works! I've been able to decode two codecs so far,
the audio (AAC) and video (h264) used when in the XMB. Not sure if I'll have
problems with ATRAC3 which is used by some games.
Anyway, thanks for the feedback and I promise not to bother ya (too much)!
Original comment by darryl...@gmail.com
on 5 May 2010 at 10:09
Original issue reported on code.google.com by
snifikino
on 6 Jan 2010 at 6:16