Closed GoogleCodeExporter closed 9 years ago
Sounds interesting. Would be cool if someone could look into it.
For me, I only use compression (usually lvl 1 thres 01) because I want to get
rid of the (nulled) update files and padding (around 100MB on new games). A
whole lot of new games don't even work with any kind of compression; they lag
horribly then.
Original comment by catnip...@gmail.com
on 22 Aug 2011 at 5:54
I already implemented this on my fork but I don't see significant changes in
the reading speed. Perhaps you could provide a benchmark to see how much it
changes when compared to cso, so we could get this merged with the official
source.
The source code is in my clone repository [1] and a simple tool to convert iso
to zso (compressed iso using lzo) is in the contrib directory. You can filter
the commits by author to find my specific patches to isoread and vsh.
[1] http://code.google.com/r/codestation404-procfw/source/list
Original comment by codestat...@gmail.com
on 25 Aug 2011 at 8:26
Here is a patch that applies cleanly over the current procfw repo HEAD and
enables lzo support; this is better than reading through my clone repo for the
changes.
Original comment by codestat...@gmail.com
on 25 Aug 2011 at 8:57
Attachments:
As far as I can tell, QuickLZ isn't the same as LZO, seems to be beaten by
about a second at decompression speed on this test (as represented by LZOP):
http://www.maximumcompression.com/data/summary_mf4.php
Original comment by hastur...@gmail.com
on 27 Aug 2011 at 9:09
It's the same (see QuickLZ with mode 3, it has 4.7 secs too). I also ran the
tests with minilzo and it was slower in decompression than QuickLZ by 1-2 secs.
Anyway, looking at my patches, it is trivial to replace the quicklz calls with
the minilzo ones. I will do it if you are interested.
IMHO, the other big problem with iso compression is the compressed block size
of 2KiB. Does anybody know what the block size of the NPUMDIMG files is? Or are
they using a dynamic block size?
Original comment by codestat...@gmail.com
on 27 Aug 2011 at 2:39
honestly this will be a great addition to the project. I know this is _super_
old but I've actually _stopped_ using compressed ISOs because it _increases_
the load times for games! By at least 1-3 seconds for me. So having a faster
compressor would be amazing. Since it's still open I thought I'd add that the
current compressor (gzip or is it zlib... either way) is by all means "good"
but as far as playing games they load _faster_ with no compression at all than
even with the lowest level of compression.
Anyway this would be nice to see. Also there's a new compression algorithm out
there called lz4 which is _even_ faster than lzo and that might be an
interesting thing to include as a possibility.
Original comment by 133794...@gmail.com
on 21 Mar 2013 at 6:08
I can't edit my previous comment but lzo's much better than it used to be. It's
~2-3x faster than what it was back when this bug was originally opened so it
should be a much better compression library to implement. I'd be grateful to
anyone who did it, since I'd love to have compressed isos that don't take
longer to load.
Original comment by 133794...@gmail.com
on 21 Mar 2013 at 6:22
If I recall correctly, CSO uses some hardware acceleration and that is why it
is the only format implemented as of now.
Given the great progress LZO has made, I think it's a good idea to implement it.
Original comment by devnonam...@gmail.com
on 25 Aug 2013 at 11:37
Ah OK, well that makes sense then, I don't believe that the lz4 guy is doing
any sort of assembly for mips specifically so that's definitely not something
that I imagine changing as it's just straight up C. He's also made the "HC"
encoder better too, and using that for the CSO would be nice in the ciso.py but
it won't be able to just be a python file anymore. He provides a CLI for people
to use and I imagine that could be put up there for people to do or someone to
just compile for those systems.
Btw, LZ4 HC uses the _same_ decoder as LZ4 stock so it's gotten way better.
Also if I could've I would've edited this to say "lz4" as I found out about it
way too late.
LZ4 has _already_ been used on a psp game for the system with great
effect/performance(according to a minis developer) so that's the one I'd shoot
for as LZO hasn't been updated in a long while. I do believe I meant to say lz4
but I had said lzo at the top there.
But since someone's already gotten the support in there LZO would be much
better than what we currently have which is crusty old deflate.
The lz4 is also clean C, so I'd like to see that moreso if you'd be willing to
put it in there.
Btw this is the _most_ current LZO vs LZ4 (semi-most current, a couple of
months old, but not likely much difference)
            Ratio  Compress (MB/s)  Decompress (MB/s)
LZ4 (r101)  2.084  422              1820
LZO 2.06    2.106  414              600
https://code.google.com/p/lz4/
that's a link to his google code page.
and here is LZ4 HC vs default compression level for zlib.
               Ratio  Compress (MB/s)  Decompress (MB/s)
LZ4 HC (r101)  2.720  25               2080
zlib 1.2.8 -6  3.099  21               300
Way way way faster at decompression which is where this counts as the CISO
driver _won't_ be compressing data, it'll be _decompressing it_
Original comment by 133794...@gmail.com
on 25 Aug 2013 at 11:47
I'm still wondering if LZ4 or LZ4 HC should be implemented...
LZ4 HC has a much better ratio and slightly better inflating speed while it is
15x slower to compress.
I think LZ4 HC is still better overall
Original comment by devnonam...@gmail.com
on 26 Aug 2013 at 12:07
I think that the overhead with the cso format is in the block size, 2K is too
small for some games and the psp loses too much time in reading the compressed
blocks and adding them to the internal list. I think that a good approach is to
modify procfw so it doesn't ignore the cso block size, so bigger ones can be
used. Also I think it is a good idea to add an option to generate a log file with all
the reads and size of these (profiling the reads of the game). This file could
be used to optionally feed the compressor so it can choose the optimal block
size for different parts of the game while compressing the iso. This can be
translated in fewer reads and bigger decompressed blocks in the iso cache.
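To put a number on the block size argument, here is a minimal Python sketch that counts how many compressed blocks a set of reads would touch at different block sizes. The read log is made up for illustration (a real one would come from the profiling option proposed above), and `blocks_touched` is a hypothetical helper, not something from procfw.

```python
def blocks_touched(offset, size, block_size):
    """Number of compressed blocks that a read of `size` bytes at `offset` spans."""
    if size <= 0:
        return 0
    first = offset // block_size
    last = (offset + size - 1) // block_size
    return last - first + 1


# Hypothetical read log: (offset, size) pairs as a profiler might record them.
read_log = [(0, 2048), (32768, 80 * 1024), (1000, 4096)]

for block_size in (2048, 8192, 65536):
    total = sum(blocks_touched(off, size, block_size) for off, size in read_log)
    print(f"block size {block_size:>5}: {total} blocks touched")
```

Bigger blocks mean fewer blocks per read, but also more data decompressed that the game never asked for, so the best size likely depends on the read pattern of each game.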
I cannot find any documentation for lz4; does it support overlapped
decompression? Lzo supports it so one doesn't need to allocate an extra buffer
to hold the decompressed output (very important in kernel mode since memory is
so limited).
Btw, compress time should be a non-issue since it is only done once and outside
of the psp.
Original comment by codestat...@gmail.com
on 26 Aug 2013 at 12:38
lz4 _is_ memoryless in decompression. As in it needs _no_ extra memory besides
the block it is currently decompressing. I'm 99.9% sure that that's how it's
set up, you can ask the main developer himself, I was talking with him about
using it in another way that was on a mips cpu with 32mb of ram _total_ and was
worried about the memory required for decompression and he told me that
decompression requires _no_ memory at all.
Also lz4 and lz4 HC use the _same_ decompressor. That means you have _one_
decompressor they use the _same_ algorithm. It's more like zlib level 1 vs zlib
level 6 it's the same exact algorithm but with a different amount of CPU time
spent on it and also the dictionary(I'm 99.9% sure).
So thus it just decompresses on the fly, it says "here's the data" and that's
it. it decompresses as it goes in the default one, you can definitely ask him
if you want to be sure I'm not 100% on this as I've not dug into it too much
yet.
I was also talking to him about the memory used b/c the mips cpu doesn't have
an l2 or an l3 and thus memory has to be stolen from the main system, and his
block sizes should allow you to hold them _entirely_ within the l1 dcache.
Now then let me just say though, that I _do not_ know the exact amount of
icache/dcache on l1 for the specific mips cpu on the psp. But I do know that he
got into an argument with someone who was on that project about the memory/cpu
time used. If you'd like I can link to that discussion as it's also on something
where every single byte is required.
I was talking with that developer about using it for compressed ROM files,
since with zlib you'd have to either a) decompress the whole thing into memory,
or well that's it. And paging out the memory is hard and weird. But he made it
clear that it's perfectly fine.
Finally, the developer I was talking about was 1000 tiny claws, the algorithm
is on the BSD license and the developer specifically said how glad they were
that the thing was memory-less. Since I'm 99.9% sure that they were making
_sure_ that it ran on psp-1k models which means that 24MB is it.
Also lz4hc is _way_ better at this kind of thing as it's what he designed it
for. Well to be honest, he designed LZ4HC for packet compression and is still
working on inter-block compression to make it compress better as it holds no
"state" at all. Each block is its own thing. It throws away the dictionary etc
after compressing each block. He is working on making lz4hc during
_compression_ allow that inter-blockness and keep the dictionary there for
streaming and it'll be on as "not by default" for the foreseeable future.
To summarize it all up, the decompressor holds no state at all, you get
whatever data comes out of it. If you specify "I want block 1 through 3". It
gives you block 1-3. It just hands you back them to however you were wanting to
use them as each block throws away everything after the compression. Also I
don't know what kind of block size you'd want to do, it does up to 20MB but I'm
sure you guys would go with a smaller blocksize. They seem to allow 64k-4MB. I
guess that 20 was made up in my head. either way I'm sure this'd _greatly_ help
the performance of that system completely.
Once again I'd definitely talk with the author about any of the innerworkings
of the algorithm b/c he knows _way_ more about it than me. I'll drop his
twitter handle below here so you can give him a message there, or open an issue
on his google code as he's very responsive about issues.
https://twitter.com/Cyan4973
that's his twitter message. The developer that I was talking about was porting
GPsp to another mips architecture and had improved it and had to worry about
the total 64MB (probably lower than that due to kernel memory used) but either
way he said after digging into the code itself, that he had no qualms about his
early claims about suboptimal performance on mips due to the lack of l2/l3.
Original comment by 133794...@gmail.com
on 26 Aug 2013 at 12:56
About implementing it in ciso.py, this is possible as there is a bridge for it
here : https://pypi.python.org/pypi/lz4
However, it still requires us to compile the C files
Original comment by devnonam...@gmail.com
on 26 Aug 2013 at 11:28
Ah OK that looks to be really good then, I thought there was a python binding
but I wasn't completely sure. Also I don't know how complex homebrew is on the
PSP but since I don't know how many people use ciso.py(I don't know how many
have python installed to use the cli), you guys could look at making a homebrew
program on the psp so that people can compress/decompress isos/ciso(some other
name for the lz4 based ones) so that it can be more widely used.
Also since you're going to be changing the block size/sector size used by
ciso, I'd suggest maybe storing at the top of the file values showing where the
original blocks/sectors were stored.
It'd be something like the following, I don't know how contentious memory is
for the kernel space, so whilst I think 4byte values would be way better due to
the original ISO sector size being _only_ 2k I can understand if you can't use
that size of value.
Instead of having to store 2 values eg
compressed_block_number,original_blocks_held, it'd just be compressed_blocks
held. You'd know what the compressed block is by looking at _where_ it is in
the array. index 0 is the first, index 1 is the next etc etc.
So for the first one it'd be lba 0-32/whatever, and the next 33/whatever+1 etc
etc. So when a game says "hey I need lba x to y" you can more easily seek it
out. Now if you _already_ have such a system in place then you shouldn't need
this system there I am unsure to how you've got it setup currently in there.
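A minimal sketch of the implicit-index idea described above, in Python: the table stores only how many original sectors each compressed block holds, and the block number is simply the position in the array. The counts are made up, and `block_for_lba` is a hypothetical name; nothing here is taken from the existing CSO driver.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical table: number of 2 KiB ISO sectors held by each compressed block.
sectors_per_block = [33, 33, 20, 33]

# starts[i] is the first LBA stored in compressed block i; the last entry is
# the total sector count. Built once when the image is opened.
starts = [0] + list(accumulate(sectors_per_block))


def block_for_lba(lba):
    """Return the index of the compressed block that contains sector `lba`."""
    if not 0 <= lba < starts[-1]:
        raise ValueError("LBA outside the image")
    return bisect_right(starts, lba) - 1


print(block_for_lba(0), block_for_lba(33), block_for_lba(118))
```

If every compressed block held the same number of sectors, the lookup would collapse to a single integer division and no table would be needed at all.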
Also here's an ISO that I just got done redumping using lz4hc vs gzip(not using
the other zip archiver as I'm unsure if it uses zlib or not)
Final Fantasy 7 Crisis Core 1.7GB uncompressed(1716713472)
lz4hc 3.508s compressed to 1.2GB(1229536276)
gzip -6 58.053s compressed to 1.1GB(1095743996)
lz4 decompressed in 1.653s
gzip decompressed in 12.370s
So as you can see lz4hc compresses _slightly_ worse but is way way
faster at both compression and decompression. This was _all_ done off of a
ramdisk but anyway yeah. I just wanted to put that there, I don't know why
really. But anyway I think that list of blocks used would be a good "hack" to
make loading even better so you don't have to seek for stuff as lz4 doesn't do
_any_ kind of structures at all. It just gives you compressed data with a hash
appended to each block(4byte not crc) that's it. No other structure data at all
as far as I'm aware so it's like raw_deflate.
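For reference, the size ratios and decompression throughput implied by the Crisis Core figures quoted above can be worked out with a few lines of Python. Only the byte counts and timings come from the comment; the helper names are made up for this sketch.

```python
ORIGINAL_BYTES = 1716713472  # uncompressed ISO size quoted above

# name: (compressed bytes, decompress seconds), as quoted in the comment
results = {
    "lz4hc": (1229536276, 1.653),
    "gzip -6": (1095743996, 12.370),
}


def size_ratio(compressed, original=ORIGINAL_BYTES):
    """Compressed size as a fraction of the original size."""
    return compressed / original


def throughput_mb_s(seconds, original=ORIGINAL_BYTES):
    """Decompressed output rate in MB/s (1 MB = 10**6 bytes)."""
    return original / seconds / 1e6


for name, (size, d_time) in results.items():
    print(f"{name}: {size_ratio(size):.1%} of original, "
          f"{throughput_mb_s(d_time):.0f} MB/s decompression")
```

These are ramdisk numbers from a PC, so they only show the relative gap between the two; they say nothing about absolute speed on the PSP.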
Original comment by 133794...@gmail.com
on 26 Aug 2013 at 2:42
Also here, if you guys can, I'd keep a good half of the "iso cache" as
compressed data after you've read it off of the memory stick since lz4hc
decompresses so bloody fast. I can imagine that it could even be faster than
reading raw data off of the MS(maybe even with the MS speedup hack) and thus
you could store more data in memory though I don't know how much good that'd be
to be honest, I do know that it works well for the databases out there, and the
kernel etc where they just store the compressed page in memory and evict the
uncompressed one when memory gets tight.
Original comment by 133794...@gmail.com
on 26 Aug 2013 at 2:49
OK I've looked at the source code for lz4 and talked to the guy. Basically all
you need is a buffer to store the output for that block. So all you need is the
following. Memory for the compressed block and memory for the uncompressed
block. Lz4 also doesn't use the same dictionary across blocks of data each one
is unique. So you don't have to keep that dictionary across any blocks.
It does allow you to use the same exact dictionary too if you want to increase
compression ratio but I imagine that you'll have to keep that memory there and
if you're going to do such a thing seeking through it would be way way harder.
How much memory do you guys have in the kernel atm by the way?
Original comment by 133794...@gmail.com
on 22 Nov 2013 at 12:04
OK, I've sat down and looked at lz4 and here's the latest numbers. For the
compression of a complete ISO this is on the _maximum_ mode the only thing I've
clawed back is the block size(to reduce memory used).
Here's how it ended up.
The raw ISO is 428.4MB
lz4 "high compression" with block size of 64KB, it ended up at 416MB.
CISO with its current one is ~421MB.
Next up, the total memory used for lz4 in the block size of 4(lowest possible).
The peak memory used when decompressing it _all_ was ~688KB during
decompression. I don't _yet_ have lzo to test itself.
When using lzo the peak memory usage on the best compression level ala lz4 hc
it uses ~888KB of total memory. A whole 200KB _more_ than lz4. So your "it
doesn't use any more memory" thing doesn't seem to hold up for me.
P.S. If you're talking about reusing the same dictionary from block to block
then that _does_ require _more_ memory. It almost doubles the total _peak_ ram
that is used. And also by the way this is the _entire_ file so if you're only
decompressing let's say 128KB then you're obviously not going to use that much
memory. The file size difference between lzo and lz4 is ~200KB on the highest
level. When you do interblock dependency(which I'm sure would make seeking
across the CSO a lot harder since right now you can just start using the
thing). It becomes ~1MB smaller. So either way I hope that this shows you that
Also about the interblock compression thing, it essentially means that it keeps
the compression dictionary across blocks instead of throwing it away after each
of the blocks. So that's all that I wanted to say about it. If you increase the
overall block size of the files/iso driver to 64k that'd obviously increase
compression capability a very very large amount. I'm no python programmer so I
can't tell you how much better it'd get but I hope this shows how well suited
lz4 is to memory constrained situations.
Original comment by 133794...@gmail.com
on 23 Nov 2013 at 5:37
I'm doing tests on lz4 on a mips platform, but here it is from the source himself:
Yann Collet @Cyan4973 5 Dec
@133794m3r LZ4 doesn't use any temp buffer. It's straight from the source
buffer to the output one.
So it _doesn't_ require extra memory during decompression even in the
inter-block dependency mode for lz4hc. On a mips platform I have it's ~2-10x as
fast as gzip(level 6 which is similar to zlib level 6)
Original comment by 133794...@gmail.com
on 6 Dec 2013 at 10:57
That's with interblock compression as in reusing the dictionary across blocks.
One thing to warn you if you're doing that in zlib or anything else, you're
going to lose the ability to seek to a random block within the data. So long as
you do the standard mode(lzo/lz4) with a decent block size you'll be OK. Also
lz4 tells you the decompressed size of each block. I don't remember the exact
function off hand but I know the data is there(to help you figure out how much
memory to allocate).
If you're doing lz4hc for the blocks, you can read them in, and then
immediately put them into the output buffer. There's no temp buffer. So at
maximum it'd take ~130KB(peak memory usage) for the smallest block size(what I
suggest you do). So that's the peak total memory that you'd need. It'd be
compressed block+uncompressed block. If you up the block size to ~64k(which is
going to make it a ton better by the way) then you can just get the sectors
from the game themselves and just do it as you can/want to.
Original comment by 133794...@gmail.com
on 6 Dec 2013 at 11:06
For more proof how low memory lz4 can be. Here is an example of the
decompressor running on an Apple IIgs.
http://www.brutaldeluxe.fr/products/crossdevtools/lz4/index.html
Original comment by 133794...@gmail.com
on 10 Dec 2013 at 11:55
I started working on it, it seems like we are going to have some trouble
getting the sector size but hopefully it'll work.
Original comment by devnonam...@gmail.com
on 14 Dec 2013 at 12:16
Original comment by devnonam...@gmail.com
on 14 Dec 2013 at 12:17
About the sector size, you may just have to end up doing just 4KB or something
similar. Since that's not _too_ much larger and is more akin to what most
devices use and I know that it does well with it. If you're working on it. I'd
do the HC mode for the lz4 compressor as that's much much more akin to zlib
level compression but also decompresses at the same speed as stock lz4(very
very close).
http://133794m3r.github.io/
That link above is where I did some tests on another mips based device(more
recent ISA), and also did tests on the smallest possible block size in terms of
compression time. It was with the 64KB block size.
So yeah I see why the sector size thing would cause issues, and you'll also
likely have to somehow store a set list of which LBAs are stored within the
compressed blocks so that you have to seek less throughout the thing. The
overall compression seems to be about the same as zlib with 4KB sectors (or so
say the kernel guys).
I eagerly await the updates on it. Thanks for doing the work.
Original comment by 133794...@gmail.com
on 14 Dec 2013 at 2:16
Actually I will only implement the lz4 decompressor because it is also able
to decompress lz4hc stuff (and as you can see, lz4hc doesn't even feature a
decompressor).
About the compressor, maybe we will have something like ciso.py which will
include the compressor (probably not in Python because that would require
me to port the compressor, and I think this is a bad idea: since it gets
updated, it will be hard to maintain).
The code will probably need some refactoring later because the current ISO
drivers are made to only support CSOs (in the code structure).
Original comment by devnonam...@gmail.com
on 14 Dec 2013 at 11:25
Ah OK, I forgot about that. And about the tools to do it. I'm going to look
into trying to write up some code to do the CISO tool to do lz4. You may want
to change the CSO header to include something different to make sure that lower
versions of procfw/others won't be trying to open up the file needlessly.
Also depending on the block size, if it's more than let's say 4KB, you may want
to include at the front a list of values. 2 values, both 32bit numbers.
It'd be something like the following: the first value is the last LBA held in
the compressed block (so the first block covers LBA 0 to that number), and the
second is the ending byte for that compressed block.
32:14421
and it'd keep on going to keep from trying to seek randomly throughout all of
the CSO.
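A rough sketch of how such a table of 32-bit pairs could be written and searched, in Python. The little-endian layout, the reading of "ending byte" as an exclusive end offset, and every entry after the first `32:14421` pair are assumptions made for illustration; no format was ever agreed on in this thread.

```python
import struct
from bisect import bisect_left

# One entry = (last LBA held by the block, end offset of the block in the file).
ENTRY = struct.Struct("<II")  # assumed: two little-endian unsigned 32-bit values


def pack_index(entries):
    """Serialize the (last_lba, end_offset) pairs back to back."""
    return b"".join(ENTRY.pack(lba, end) for lba, end in entries)


def unpack_index(raw):
    """Parse the table produced by pack_index()."""
    return [ENTRY.unpack_from(raw, pos) for pos in range(0, len(raw), ENTRY.size)]


def byte_range_for_lba(index, lba, data_start=0):
    """Return (start, end) file offsets of the compressed block holding `lba`."""
    last_lbas = [entry[0] for entry in index]
    i = bisect_left(last_lbas, lba)
    if i == len(index):
        raise ValueError("LBA outside the image")
    start = data_start if i == 0 else index[i - 1][1]
    return start, index[i][1]


# First entry is the example from the comment; the others are invented.
table = unpack_index(pack_index([(32, 14421), (65, 30010), (98, 41200)]))
print(byte_range_for_lba(table, 40))
```

Since each block's start is the previous block's end, one binary search gives both the position and the length of the compressed data to read.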
Once you figure out how you want to do it, I can whip up a command line program
for linux/windows. I don't have any access to a mac machine and I don't know
about cross-compiling it to that platform and making sure that it works. The
current python program probably could be able to do it. The only thing is that
some people who don't have python couldn't use it. So I'm going to try to write
up a quick program that's commandline based to do the compression of the ISOs
using the block size that you've selected. once you figure it out, update this
bug document or whatever so that I can know what to work with.
Original comment by 133794...@gmail.com
on 14 Dec 2013 at 10:16
Took the bullet and made an LZ4 implementation of the cso format and added
support for it in PRO. I also modified the ciso.py tool to be able to
compress/decompress zso images (LZ4 compressed isos).
To enable LZ4 compression pass -z to the ciso.py command-line and include a
compression value between 1-8. If you pass a 9 then LZ4 HC will be used instead
for the compression (slower to compress but gets a slightly better ratio).
Very simple LZ4 decompression patches are added on vshctrl and the
galaxy/inferno drivers. Also vshctrl is patched so the .zso files can be
recognized at the XMB.
In my tests I didn't notice much difference compared to LZO compression, so I
started investigating why the cso format was so slow to read and ended up
making a big optimization for it (only on the inferno driver). I'll try
explaining it below:
The current cso decompression method tries to read and decompress every gzip
compressed block one by one. For example an 80KiB read needs 40 sceIoRead reads
of 2KiB for the blocks alone (plus any cso index reads that it needs). This
excessive access to the memory stick slows down the whole cso reading, making
the gzip decompression time unimportant.
With my method I reduce the total I/O to a max of 4 reads: one for the index and
a max of three for the compressed blocks, which I read in one go. If the block
and requested size are aligned to the ISO sector then it only needs a total of
two reads to the memory stick.
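The batching idea can be illustrated with a small self-contained Python sketch. This is not the patch's C code: it builds a toy CSO-like image in memory, assuming the commonly described CSO v1 layout (24-byte header, a table of 32-bit offsets, raw deflate blocks), and it ignores the plain-block flag and the alignment shift of real index entries. It also always decompresses whole sectors into a temporary buffer, so it uses two reads even for unaligned requests instead of the up-to-four described above. `build_cso`, `CountingFile` and `read_range` are hypothetical names.

```python
import struct
import zlib

SECTOR = 2048
HEADER_SIZE = 0x18


def build_cso(iso):
    """Build a toy CSO-like image: header, offset table, raw deflate blocks."""
    blocks = [iso[i:i + SECTOR] for i in range(0, len(iso), SECTOR)]
    header = struct.pack("<4sIQIBBH", b"CISO", HEADER_SIZE, len(iso), SECTOR, 1, 0, 0)
    offset = HEADER_SIZE + 4 * (len(blocks) + 1)
    index, body = [], bytearray()
    for block in blocks:
        comp = zlib.compressobj(9, zlib.DEFLATED, -15)  # -15 selects raw deflate
        data = comp.compress(block) + comp.flush()
        index.append(offset)
        body += data
        offset += len(data)
    index.append(offset)  # end of the last block
    return header + struct.pack(f"<{len(index)}I", *index) + bytes(body)


class CountingFile:
    """Stand-in for the memory stick that counts how many reads are issued."""

    def __init__(self, data):
        self.data, self.reads = data, 0

    def pread(self, offset, size):
        self.reads += 1
        return self.data[offset:offset + size]


def read_range(f, offset, size):
    """Read `size` ISO bytes at `offset` with one index read and one bulk read."""
    first = offset // SECTOR
    last = (offset + size - 1) // SECTOR
    count = last - first + 1
    raw = f.pread(HEADER_SIZE + 4 * first, 4 * (count + 1))
    index = struct.unpack(f"<{count + 1}I", raw)
    packed = f.pread(index[0], index[-1] - index[0])  # all blocks in one go
    out = bytearray()
    for start, end in zip(index, index[1:]):
        out += zlib.decompress(packed[start - index[0]:end - index[0]], -15)
    skip = offset - first * SECTOR
    return bytes(out[skip:skip + size])


iso = bytes(range(256)) * 8 * 64  # 128 KiB of sample data
f = CountingFile(build_cso(iso))
data = read_range(f, 4096, 80 * 1024)
print(data == iso[4096:4096 + 80 * 1024], f.reads)
```

The same 80KiB request that costs 40 block reads when done one block at a time is served here by two reads, which is where the saving comes from.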
In my tests I managed to play GTA Liberty City Stories from a CSO compressed at
level 9 (cpu at 333 and ms access speedup enabled) with virtually no lag
whatsoever. I also tested 2 other compressed games with similar results.
I leave this patch so it can be reviewed for possible implementation bugs and
hopefully can be merged on procfw.
Original comment by codestat...@gmail.com
on 16 Apr 2014 at 12:10
Attachments:
Patch updated, now it maintains a partial cache of the index table so doesn't
need to be loaded on every read.
Got performance improvements on GoW: GoS (prologue video) compared to the first
patch.
Original comment by codestat...@gmail.com
on 17 Apr 2014 at 7:16
Attachments:
Sorry, wrong patch attached.
Original comment by codestat...@gmail.com
on 17 Apr 2014 at 10:46
Attachments:
New patch. This solves the problem with Jetpack Joyride (tries to read beyond
the iso). The updated code now manages the corner cases of reading the last
block, reading with an invalid offset (returns 0) and size going beyond the iso
size (the size gets readjusted).
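The corner cases listed above boil down to a small bounds check. A minimal Python sketch, assuming byte offsets and a known total ISO size; `clamp_read` is a hypothetical name and the real patch does this in C inside the driver:

```python
def clamp_read(offset, size, iso_size):
    """Return how many bytes may actually be read for this request.

    An invalid offset yields 0; a request that runs past the end of the
    ISO is shortened so that it stops at the last byte.
    """
    if offset < 0 or offset >= iso_size or size <= 0:
        return 0
    return min(size, iso_size - offset)


print(clamp_read(100, 50, 120))   # request runs past the end
print(clamp_read(120, 10, 120))   # offset is already at the end
```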
Original comment by codestat...@gmail.com
on 14 May 2014 at 3:33
Attachments:
Sorry for all of the not posting stuff. I'm currently getting ready to finally
start to try to test this thing with various games. I was busy and dealing with
horrible news. Not going to repeat here, but it kinda took my life for a very
long time. Anyway thanks for the updates and I'm going to build procfw and test
it on my psp with various games.
Do you know of any that are particularly difficult to run? For example I have
quite a few of the psn releases that were pkgs which I made into isos would
they be likely to find random bugs within the new driver?
Original comment by 133794...@gmail.com
on 13 Jun 2014 at 9:03
Hello, thank you for your help.
Before merging this patch, we want to make sure that it doesn't affect the
compatibility negatively. That's why any game is worth testing.
If possible, please test it when compressed in GZIP and then LZ4, so we can
make sure that both work properly.
Original comment by devnonam...@gmail.com
on 13 Jun 2014 at 9:09
Ah OK, I didn't know if there were any issues still as you said earlier about
jetpack joyride _not_ working for some reason and I thought that was to do
with lz4 in ciso.py
I'm going to be trying out games that I know have been pretty stable for me in
the past and work completely AOK. There is one game that seems to _never_ work
for me under procfw-c2 which is the dungeon siege game I get far into the game
and then it bugs out and stops working.
I don't know if this is the game itself as I've not seen anyone else have this
issue in any other site but if you want I can provide the save file for it.
Original comment by 133794...@gmail.com
on 13 Jun 2014 at 9:27
Made a testcase [1] to check that the algorithm is correct and doesn't have
memory leaks. Compile it with:
gcc csotest.c -o csotest -lz -llz4 -lssl -lcrypto -fstack-protector
Run it with:
./csotest some.cso load.txt
Also run it with:
valgrind ./csotest some.cso load.txt
To make sure that no memory leaks, buffer overflows or any invalid reads/writes
were done. Made a test with Pool Hall pro ISO/CSO/ZSO (read the offsets/size in
the load file 500 times.):
ISO (just to have a reference point):
time spent (old method): 1.186666
time spent (new method): 1.174465
CSO:
time spent (old method): 7.853323
time spent (new method): 6.964370
ZSO:
time spent (old method): 2.070728
time spent (new method): 1.377143
So on my PC there is ±0.01 seconds of error margin (according to the ISO read).
[1] https://gist.github.com/codestation/bf1cc67ddf7c490c9626
Original comment by codestat...@gmail.com
on 15 Jun 2014 at 2:09
right here is the tests on my own computer, I did star ocean first departure,
knights in the nightmare, and puyo pop fever. Each one of varying size and I
believe time when they were developed. I'm posting as a pastebin link since I
don't want to fill this thing with a ton of extra stuff.
It looks like they're very similar and I'll say if valgrind found anything
below instead of the whole "found nothing" bit. Here's the link to the tests
without valgrind's data included as that just makes it a ton slower and thus I
found no reason to actually post those ones.
http://pastebin.com/raw.php?i=yBMgtTjd
ok there's zero memory leaks and I tried three games, with a total of 9 tests
in normal and also valgrind and it seems to be AOK as stated all was done on a
ramdisk so it should be fine.
Original comment by 133794...@gmail.com
on 16 Jun 2014 at 1:41
OK, you can maybe call me crazy but I am not seeing the zso files on my memory
stick. They're simply not listed. I don't know if this is due to a bug in the
vsh menu or what, but I don't see them there at all and I know that they're
there, as I pasted them from my tests folder to see how they would play on my
psp and the files aren't listed in vsh. I don't know if this is normal, or
what? And anyway yeah have you tried it out yourself with zsos?
By the way is there some sort of logging program so that I can figure out why
vsh isn't showing my zsos? I did the patch as was required as far as I know, as
it has references to zsos/lz4 in all of the right places put out via the patch.
Original comment by 133794...@gmail.com
on 17 Jun 2014 at 5:06
Very weird. Are you 100% sure that the installer is writing the vshctrl.prx
file? If you were already using Pro, did you force the reinstall of the cfw?
(hold L while installing). Vshctrl is patched to recognize .zso extensions and
added lz4 routines so it can load the previews on the xmb.
About logging you could add some sceIoWrite statements and use the fd = 1
(stdout), and capture it with psplink. Better than enabling DEBUG and slowing
down the whole cfw.
Original comment by codestat...@gmail.com
on 17 Jun 2014 at 5:25
Well I'm on a psp 3k model so I seriously doubt that it'd be doing much of
anything with any such file in ram unless it is. I'll try to run the proupdate
to just see if it'll do anything then since I always figured on a 3k just run
fast recovery and you're good to go as it showed up the csos and also the isos
whereas the psp ofw would just show broken game for the isos.
Original comment by 133794...@gmail.com
on 17 Jun 2014 at 7:40
It worked. I guess I'm just stupid then, I didn't know that the psp 3k could
do anything like installing a cfw that or it had to replace it in ram or
something either way it's now working 100% apparently.
Original comment by 133794...@gmail.com
on 17 Jun 2014 at 7:44
Good, this new patch seems pretty robust.
We'll do some further testing, and then I'll merge it if we don't find any
issue.
Original comment by devnonam...@gmail.com
on 17 Jun 2014 at 7:49
I'm currently testing the same games on my psp as we speak to see if anything
seems different; the one thing that goes along with your changes is that
the activity light on the memory stick seems to be flashing less. I haven't
tried dungeon siege yet as that game I don't know what it's doing whilst
loading because even from the memory stick as an iso it takes almost a minute
to do its loading but I'm checking the other ones to see that they all play OK
then.
P.S. the pastebin with the values of the various games, does that seem to be in
line with what you saw yourself/the load.txt should I try to use a larger game
with it as it seems the offsets are in the gigabytes and the largest game I
know of is Final Fantasy Type-0 which clocks in at 2.5GB
Original comment by 133794...@gmail.com
on 17 Jun 2014 at 8:00
Edit to say, i tried fftype0 with the 500 reads test and the results were
similar to the other ones and have tried a few games and I'm currently playing
one without any real issues right now.
Since I can't find anything about it anywhere on the wiki and w/o having to
learn the source code. What's the cache's number, is that blocks or reads? or
what exactly. since I'd like to use the iso cache as well as possible. Finally
is the iso cache's cache the raw blocks still compressed, or are you storing
them uncompressed?
Original comment by 133794...@gmail.com
on 18 Jun 2014 at 3:25
OK, I found an issue: it happens when I resume from sleep with a ZSO. I don't
remember whether this happened with CSOs, as I have never played this exact
game before, but I have seen it happen with Untold Legends, which I played as
an ISO. The game responds to some input, but when it tries to show the first
loading screen the activity light comes on as if it's reading, and then it just
freezes and I have to return to the home menu to restart the game. I don't
know what sort of debugging I should do to check whether the program is still
responding. I may end up trying it with a plain old ISO to see if it's the new
loader or not.
P.S. IsoCache is at 23MB, LRU, 512. CPU is 333/166. VSH is 100/50, and MS
speedup is always. Driver is inferno.
Original comment by 133794...@gmail.com
on 18 Jun 2014 at 6:08
OK, it's just that game itself. I tried it with a plain ISO and loaded the
save file, and apparently it borked its own save file somehow, as I had to go
back into one of the previous levels to get it going, but that seems to be it.
I've also been testing other games and cannot find any real issues with them.
As for the access light, it seems to be on less than it used to be. I'm going
to continue testing various other games with the CSO version as well to see if
they show anything odd, but it seems to be just that one game. I've played ~11
games thus far in ZSO format for ~30-45min each without any real issues.
Original comment by 133794...@gmail.com
on 18 Jun 2014 at 11:06
Wake from sleep is causing the games to freeze: they won't respond to input,
but the PSP Home button still works. I've disabled all plugins except
"noumd.prx" and it's still happening. I've even tried reducing the caches,
thinking there may be some weird bug in that. I believe this only happens with
LZ4-compressed files. When it does, the only way to fix it is a cold reboot.
I'm going to retry with plain ISOs and do some more testing, as I don't know
which part of the code is responsible for wake from sleep.
Original comment by 133794...@gmail.com
on 27 Jun 2014 at 6:58
Sorry for the delay, I have been very busy with my job. I reproduced the bug
and tracked it down to a sceIoRead where I wasn't checking the return value. It
seems that if one attempts to read just after the PSP returns from sleep, the
I/O functions return SCE_KERNEL_ERROR_DRIVER_DELETED. I changed the code that
reads the CSO index to use read_raw_data instead of a plain sceIoRead, since
that function makes a handful of read retries before giving up.
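The retry idea described above can be sketched in portable C. This is only an
illustration, not the actual read_raw_data from the patch: read_with_retry,
read_fn, struct flaky and flaky_read are all hypothetical names, the retry
count is an assumption, and the -1 return merely stands in for an error such as
SCE_KERNEL_ERROR_DRIVER_DELETED.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reader callback: returns bytes read, or a negative error. */
typedef int (*read_fn)(void *ctx, void *buf, size_t size);

/* Call `reader` up to `max_tries` times. Returns the first non-negative
 * result, or the last error code if every attempt failed. */
static int read_with_retry(read_fn reader, void *ctx, void *buf,
                           size_t size, int max_tries)
{
    int ret = -1;
    int i;

    assert(reader != NULL && max_tries > 0);
    for (i = 0; i < max_tries; i++) {
        ret = reader(ctx, buf, size);
        if (ret >= 0)
            break;
        /* A real driver would probably sleep briefly here before retrying. */
    }
    return ret;
}

/* Stand-in for a device that rejects the first few reads after wake-up. */
struct flaky {
    int failures_left;
    int calls;
};

static int flaky_read(void *ctx, void *buf, size_t size)
{
    struct flaky *f = (struct flaky *)ctx;

    f->calls++;
    if (f->failures_left > 0) {
        f->failures_left--;
        return -1; /* placeholder for a "driver deleted" style error */
    }
    memset(buf, 0xAB, size);
    return (int)size;
}
```

The important part is that the caller always looks at the return value, so a
failed read never hands an unfilled buffer to the decompressor.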
This wasn't a problem with the algorithm, so that part remains unchanged. I
attached a new patch with the changed read method. #46, can you retry your
tests with the new patch?
Original comment by codestat...@gmail.com
on 8 Jul 2014 at 2:39
Attachments:
I was on IRC trying to get help with debugging it, but everyone just kind of
said "oh, it's probably a plugin that's causing it" instead of helping, so I
wasn't able to do more debugging since I had no idea where to start adding
print statements or where to look.
So I'm also sorry for not being able to help you debug it more; I hit a dead
end and didn't want to fill up this thread with random comments.
I'll download the patch and start testing it again. As I said, Dungeon Siege
had a weird error and it might have been the same thing.
If you know what code I need to look at to put in the printf's to find the
errors, that would be great in case I find another bug here.
Original comment by 133794...@gmail.com
on 8 Jul 2014 at 2:52
Personally, I put a print statement in the iso_read function (just before the
read_cso_data_ng call) so I can get the last offset/size that the driver tried
to read before the crash (that's how I get my load.txt files to test it on the
PC using the testcase). Just declare a char buf[256] somewhere and use this:
/* write only the formatted bytes, not the whole 256-byte buffer */
int len = sprintf(buf, "offset: %i, size: %i\n", args->offset, args->size);
sceIoWrite(1, buf, len);
"1" is stdout so it gets printed to psplink (make sure to disable nodrm, it
interferes with the stdout write, dunno why).
To find this bug I added that print statement after every read_raw_data and
tagged them differently so I knew in which region it was getting stuck. Also, I
disassembled inferno.prx and checked where the EPC was located (it crashed
inside the LZ4 decompress func, so I knew that it was a bad read buffer).
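The tagged-print approach could be wrapped in a small helper along these lines.
It is a rough sketch only: format_trace and the tag strings are hypothetical,
and the unsigned offset/size types are an assumption. It builds one trace line
and returns its length, so the caller can pass exactly that many bytes to
sceIoWrite(1, ...).

```c
#include <assert.h>
#include <stdio.h>

/* Format one tagged trace line into `buf`.
 * Returns the number of bytes to write (excluding the terminating NUL),
 * or 0 if the line did not fit in `cap` bytes. */
static int format_trace(char *buf, size_t cap, const char *tag,
                        unsigned offset, unsigned size)
{
    int n;

    assert(buf != NULL && tag != NULL && cap > 0);
    n = snprintf(buf, cap, "[%s] offset: %u, size: %u\n", tag, offset, size);
    if (n < 0 || (size_t)n >= cap)
        return 0; /* truncated or failed: caller should skip the write */
    return n;
}
```

With a different tag after each read (for example "index" and "block"), the
last line printed before a freeze shows which read was in flight.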
Original comment by codestat...@gmail.com
on 8 Jul 2014 at 3:05
That sounds a bit crazy, but I'll try to keep it in mind if I can track a bug
down to something like that again. This may also have been what was causing
Dungeon Siege to fail to read when doing multiple loads.
I don't know if you know of a game that does more reads on the memory stick
during loading, but for me that game seems to be the one. Even other games by
the same company don't take as long or keep the read light on as much. I'll
recompile and test again.
Also, it looks like the PSP is using some sort of Unix for its base OS, as I
know stdout is 1 and stderr is 2. I'll try it in the future, but when it gets
to disassembling inferno.prx and then debugging it I'd probably get lost.
Finally, as far as compiler options for the PSP go, I've been using the
following, as I do with all other programs that I know cannot be easily
debugged:
-fweb -fgcse-las -fgcse-sm -fgcse-after-reload -fpeel-loops
I know that the first one makes debugging impossible, but since I've been using
it on the gcwzero and on the PSP, where you can't actually run gdb on the whole
thing, I still use it anyway. As far as perf goes, it seems to make things
~3-5% faster.
Also, I've been looking at gcc and I'm hoping that they finally change -O3 to
be more akin to -Os, where it tries to make the code fit within the caches. Is
there any reason why the PSP SDK's gcc has to be such an old version?
Finally, for real: I'll be compiling it in a second and then doing the tests on
my PSP again, along with the csotest on a ramdisk, repeatedly doing loads on
it.
Original comment by 133794...@gmail.com
on 8 Jul 2014 at 6:07
Original issue reported on code.google.com by
hastur...@gmail.com
on 22 Aug 2011 at 10:45