Open maandree opened 11 years ago
lunasay...you make may day XD
but slow, what is spected for 304 quotes...
in all cases...put the quotes on the pony-metadata can make more slow that a separate archive in a separate directory
Lunasay, the name is just so perfect (and obvious enough). Lu[n]a = Moon, Luna is a Pony.
I did think about some idea's earlier, but this is none I think really is good:
304, and that just the best quotes, imagine if we included every good quote.
Hmm, why write a helperprogram in lua? It feels both dumb and stupid to mix languages and increase amount of deps again.
So I have to drop my suggestion here, a sqlite database containing all quotes.
No, not a helperprogram. I want to learn lua, so my idea is just to try to do a lua port of ponysay. But only the ponysay.py. A plus of this is if I am able to do that (if it is not just a too unfitting language) there will be a lightweight version (that naturally will not always be up to speed.)
SQLite may be nice. However, multiple databases will be needed. A concern I have with this is, whether the majority of users, fill find it simple to modify (not just add) is personal quote database, without us having to make a toolkit for it. Altough, does that really, matter; will somepony want to do that without knowing how, and without just submitting it to this repo instead.
»Build a fast indexed file whether everything» versus SQLite.
My guess is that the first option is simpler, same performace, but that SQLite will be faster at modifing.
Well, nobody want the database as binary file in the git tree. (I hope).
So we could have the quotes as files in the git-tree, and build a database on "compile"?
Naturally. And may celestia banish anypony what would want buildable binaries in the git tree for any other reason than helping the user get started or reduce make dependencies.
Ok in all cases the PDF is not already a binary archive on the git-tree?? I know are different cases, but a SQLite database IS a new makedep for the Data Base
Off: and Luna is spanish for Moon (as same Sombra is for Shadow)
Ok in all cases the PDF is not already a binary archive on the git-tree??
Not sure what you are asking about, but the only thing binary in the git-tree
is the PDF just to help people get started from the source if they do not
know what to do, as well as the ttyponies. I personally have not tested
util-say on fully free software.
Yes, it is a new make dependency, but is is also runtime dependency.
Luna is Latin (and may other languages). Lua is Portuguese [and sound really nice for a programming language]. Sombra [new for me] is spanished derived from umbra which is Latin.
Adding binaries is not nice, but binary archives [which SQL database would be] is just bad in any case it is used.
Ahh... poor Pinkie Pie, I knocked the desktop to hard and she fell of the monitor [when due I am still using CRT:s], hit the UPS, and fell into a lot of dust.
Well, correct me if I am wrong, but python got a sqlite module. In Gentoo, it's a useflag for python itself... And we could use python to build the database from the source files.
Also, python modules aren't that big and sqlite is common.
Yep but in my Fast search is for python2 only and another but for the entire mysql (I going to investigate it a little more)
It is an optional dependency (but it does not say what it is used for in arch) by python 3. Optional dependencies should be (well, are usally) considered dependencies for program language interpreters.
I am implementing this with dbm which is faster, more trivial, enough function, and has no dependencies. But it has optional dependencies such as gdbm (everypony have gdbm) for maximum performance.
Instead of using a database you could perhaps consider cleaning up your code instead. In my fork where I dropped some unnecessary stuff and rewrote everything else, -q is about as fast as withouth -q.
Using dbm makes the code both cleaner and faster, because you just open() use the database as i dictionary and then close(). I will take a look at your fork later, I am not working too much as ponysay right know; I will look at it when I am begin working on ponysay more consistently again.
Havn't you removed a bunch of functionally in your fork and only use the essentials?
Yep, I think that's a feature^^
Naturally (generally I agree with that), but sometimes you want everything.
as far I can read in the dbm webpage, that tool is deprecated, and the author (in github https://github.com/hilbix/ ) promised update the code to githuib but intead not do that and not update dbm
so I thing that dbm is or forget by uptream, or feature complete and bug free or deprecated by upstream
dbm as in the unix database library: https://en.wikipedia.org/wiki/Dbm , http://docs.python.org/3.3/library/dbm.html
"Today the World, Tomorrow the Solar System" ;)
Naturally (generally I agree with that), but sometimes you want everything.
I think this is true concerning quality, not quantity. The Unix philosophy tells us that it is better to have a tool for every task that does this task very well instead of one tool that does everything but does not do all things that well.
Also, since it is evidently possible to do -q
in a performant manner without using a full-fledged database in the backend I think using a full-fledged database is covering up the problem instead of solving it.
[edit] For reference: the pretty un-optimized code in my fork runs on my machine (AMD Fusion E450 with 2*1.8GHz) with -q
in ~350ms which I think is acceptable for shell startup and similar tasks. [/edit]
@maandree so, sorry..i because, already exist a program called dbm and i misunderestand you, sorry
@jristz Assumed that, that package should really get a better name, you cannot call your program X if it interacts with the library X.
@maandree right agree, but now is deprecated so no-one care now
@jaseg Well on a desktop you should not notice anything, but run it one a heavily lockdown old netbook and having it start up when you start a new bash session with a lot of stuff. Also Python has only a 300th of the perfomance of C.
dbm is not a full-fledged database, it is the absolute minimum you could imagine. It is just precaching the quotes database, which allows you to just use it as a simple dictionary. [Edit: okay, not the absolute minimum, the minimum is just a list file, this also has an index to speed up the process.]
Additionally it solves the problem where the random selection is bias to pinkie because she hade to be split into three files be quotes because the the long resulting filename; I cannot se any better solution for this.
Finally using a dictionary file allows use to have two dictionary files and 3 random modes. Creating a dictionary that maps pony file → (master file, number of quotes) and a dictionary that maps master file → (pony file, number of quotes)*; lets use either randomly select a pony and prioritise the images appears with the same likelyhood, that the quotes appears with the same likelyhood, or that the the ponies (not the images) appears with the same likelyhood.
@maandree I am using a netbook.
Concerning quotes: Can you really not see a better solution for storing a few hundred lines of text than an embedded database?! óÒ Have a look at https://github.com/jaseg/ponysay/blob/master/ponysay.py#L33
What "random modes" do you think of and why would they be necessary?
The quotes are not being stored the the database.
»What "random modes" do you think of and why would they be necessary?» Not necessary, but it only takes 6 extra times. (They are not implement nor in an issue report.) If not implement the code is of same size.
Edit: Well except there is a build process, and it helps us group togather pony files.
What, then, is the purpose of the database?
What "random modes" would you like to implement? I can only see the obvious one where you take a random quote of the selected pony.
The purpose is the have a lookup table from pony files to quote files.
Currently this is done by having pinkiepie+pinkiebounce+
In the solution, the files currently in /ponyquotes/ are installed, except /ponyquotes/ponies which no longer exists, instead it is built from the metadata in the pony files. The difference in the amount of work making script for creating the database and a script for making symlinks instead is minimal; and I really think not needing to maintain the alias is what we should go for, which both would offer. It wouldn not matter that much which solution to use, except coding with the database makes it easy to implement the following three algorithms for selecting quote:
+Q pony :: pinkie pie is as probable as sweetie belle. +Q file :: each pinkie pie image is as probable, but pinkie pie is more probable than sweetie belle because there are more pinkie pie images then there are sweetie belle images. +Q quote :: all quotes are as probable.
Creating a database requires to collection the same information as making symlinks. Creating the database given the required information takes 4 lines, making the symlinks takes 2 lines.
Creating a database for storing 100s of lines of text is one additional layer on top of the file system to do something the file system is perfectly capable of doing without any help. Also, I do not think the python dbm module works with python package resources.
The three behaviors you mentioned are trivial in any case, though I do not see any need for ① and ③.
In any case, @maandree you can do a test whit one quote to look if is fast as you say or not? the only way to test the fast-ish in this case is a real test, maybe the win can by infinitecimal comparyng whit others ideas, like the one used now in jaseg-ponysay or vice versa
I have noticed that the ponyquotes is slow, this combined with the documented fact that GNU Bash is slow (I guess Bash very common) makes it horrible for using when the shell starts. And probably infeasible for legacy computers.
I do not think there is anything wrong with the implementation, but 3.0 will enhance it based on pony file meta data. Hope fully it will fixed
If 3.0 does not fix this, the two things that can be done. Either profile it (if there is nothing obviously bad in the implementation, which I do not know how to do in python, but probably it is not too hard) the other possibility is one thing I may do anyway just for fun and to learn Lua, and that is port ponysay.py (only that file) to Lua and call it lunasay. [Lua should faster and use less memory but a bit more difficult.]