Closed vvasuki closed 4 years ago
I completely agree the current structure is very ordinary (some previous versions were even more :)). However, I don't immediately see the advantage of the new structure (except the memory part)—especially because we will have some 500+ files in this arrangement. I would like to split the festivals into reasonable sized files, perhaps by tag (e.g. kanchi aradhana days), or month and so on, for convenience.
Let's brainstorm some ideas. Perhaps, the choice should be made after carefully studying the code (which I presume you've already considered) and how a new data structure may positively impact. I've been meaning to profile the code for a while. All these ideas are ideal for a 2.0 version, which should begin with completely decoupling computations, perhaps festivals as well, and panchanga TeXing.
> I don't immediately see the advantage of the new structure (except the memory part) - especially because we will have some 500+ files in this arrangement.
And that memory disadvantage is a big one. Festival data can grow arbitrarily big (given the descriptions - which I envision having rich details in various languages right in the database and the sheer number of festivals- I envision including various grAma-devatA-utsava-s, finer shrauta and smArta events like sthAlIpAka, AgrayaNa etc..).
The possibility of having 500+ files is not a big disadvantage - provided that they are properly organized and seek-time is O(1) - both for the consuming program and for the contributing human. We should take a step back and look at a more abstract level - that of events, rather than files - which is a particular representation. We ALREADY have 500+ events (which happen to have been dumped into a single file). Just adding a new festival (corresponding to a particular event) involves the contributor doing some crude mental binary search (where should I add it? Is it already there? etc..). This is a hassle, which the new system will do away with.
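A minimal sketch of the "one event, one file" idea described above, assuming (hypothetically) that each festival is a JSON object with an `id` field and is stored as `<id>.json` under a data root. The function names and layout are illustrative only, not the package's actual API. Lookup goes straight to a path derived from the id, and adding a festival answers "is it already there?" by checking whether that path exists.

```python
import json
from pathlib import Path


def festival_path(data_root: Path, festival_id: str) -> Path:
    """Map a festival id directly to its file; no scan of a giant list."""
    return data_root / f"{festival_id}.json"


def load_festival(data_root: Path, festival_id: str) -> dict:
    """Read only the one event that was asked for."""
    with festival_path(data_root, festival_id).open(encoding="utf-8") as f:
        return json.load(f)


def add_festival(data_root: Path, festival: dict) -> Path:
    """Add a new event, refusing to overwrite an existing definition."""
    path = festival_path(data_root, festival["id"])
    if path.exists():
        raise FileExistsError(f"{festival['id']} is already defined at {path}")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps(festival, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path
```

Only the requested event is parsed, so memory use would not grow with the total size of the festival data.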
> I would like to split the festivals into reasonable sized files, perhaps by tag (e.g. kanchi aradhana days), or month and so on, for convenience.
Splitting by tag is something I would discourage: it has the same old sequential search problem. Splitting by month is better, but I would still prefer splitting by day, given the sheer number of events of note. As a parallel: in the jyotiSha package, I have often wanted to split huge files with many, many lines of code into smaller files before making some (often minor) change I desire.
> Let's brainstorm some ideas. Perhaps, the choice should be made after carefully studying the code (which I presume you've already considered) and how a new data structure may positively impact.
Indeed, I got this idea even as I wanted to have a per-day panchanga API and found festival computation all mixed up with annual panchanga computation. The code should definitely change along with the data and functionality should not be broken - agree on that. That's why I began adding pytest yesterday so that we can setup continuous automated testing with travis.
> All these ideas are ideal for a 2.0 version, which should begin with completely decoupling computations, perhaps festivals as well, and panchanga TeXing.
The changes should be done incrementally. Cleaning up and improving the festival data is a prerequisite for further plans I have (e.g. a web service that produces an ICS calendar given the location and inclusion + exclusion tags in the URL). Separating out the TeX stuff is a big one too, but mostly orthogonal.
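As a rough illustration of the ICS output mentioned above, here is a minimal sketch that serializes (name, date) pairs as all-day iCalendar events. Everything here is an assumption made for illustration: the function name `to_ics`, the `PRODID` string, and the `example.invalid` UID domain are hypothetical, and the location and tag handling of the planned web service is not covered. Long lines are not folded at 75 octets, which a complete writer should do.

```python
from datetime import date, datetime, timezone


def _escape(text: str) -> str:
    """Escape the characters that iCalendar reserves inside TEXT values."""
    return (
        text.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def to_ics(events, prodid="-//example//festival-sketch//EN") -> str:
    """Serialize an iterable of (name, date) pairs as all-day events."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", f"PRODID:{prodid}"]
    for index, (name, day) in enumerate(events):
        lines += [
            "BEGIN:VEVENT",
            # Hypothetical UID scheme; a real service needs stable unique ids.
            f"UID:{index}-{day:%Y%m%d}@example.invalid",
            f"DTSTAMP:{stamp}",
            f"DTSTART;VALUE=DATE:{day:%Y%m%d}",
            f"SUMMARY:{_escape(name)}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"
```

For example, `to_ics([("example festival", date(2019, 1, 15))])` (an arbitrary placeholder name and date) yields a calendar with a single all-day event.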
Agree with practically everything. I'm excited about the web service that produces an ICS! As an aside, one functionality I'd like there is tag-based selection of events to dump into the ICS.
For solar_month, we have nakshatra as well as tithi based festivals (and rarely, "day-based"). So for a start, can we split by solar and lunar months alone?
Testing would be great too --- I currently test by diffing the TeX file!!
> As an aside, one functionality I'd like there is tag-based selection of events to dump into the ICS.
Indeed, I have been thinking the same. Our programs should accept two parameters: tags_to_include and tags_to_exclude.
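One possible reading of the tags_to_include / tags_to_exclude parameters, as a sketch. It assumes (hypothetically) that each event is a dict with an optional `tags` list, that exclusion wins over inclusion, and that omitting tags_to_include means "include everything"; none of these choices is settled in this thread.

```python
def select_by_tags(events, tags_to_include=None, tags_to_exclude=None):
    """Return the events that pass the include/exclude tag filters."""
    include = set(tags_to_include) if tags_to_include is not None else None
    exclude = set(tags_to_exclude or ())
    selected = []
    for event in events:
        tags = set(event.get("tags", ()))
        if tags & exclude:
            # Any excluded tag removes the event, even if it is also included.
            continue
        if include is not None and not (tags & include):
            # An include list was given and none of the event's tags match it.
            continue
        selected.append(event)
    return selected
```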
> For solar_month, we have nakshatra as well as tithi based festivals (and rarely, "day-based"). So for a start, can we split by solar and lunar months alone?
Sure!
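A sketch of what the agreed first step (splitting by solar and lunar months) could look like. The schema is assumed for illustration: each rule is taken to carry hypothetical `month_type` ("solar_month" or "lunar_month") and `month_number` fields, and the output layout `<month_type>/<NN>.json` is likewise only a suggestion, not the repository's actual structure.

```python
import json
from collections import defaultdict
from pathlib import Path


def group_by_month(rules: dict) -> dict:
    """Group {festival_id: rule} by (month_type, month_number)."""
    groups = defaultdict(dict)
    for festival_id, rule in rules.items():
        key = (rule["month_type"], rule["month_number"])
        groups[key][festival_id] = rule
    return dict(groups)


def write_groups(groups: dict, out_dir: Path) -> list:
    """Write one JSON file per month, e.g. solar_month/09.json."""
    written = []
    for (month_type, month_number), rules in sorted(groups.items()):
        path = out_dir / month_type / f"{month_number:02d}.json"
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(
            json.dumps(rules, ensure_ascii=False, indent=2, sort_keys=True),
            encoding="utf-8",
        )
        written.append(path)
    return written
```

Sorting keys on output keeps the files stable under version control, so a diff shows only the festivals that actually changed.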
> Testing would be great too --- I currently test by diffing the TeX file!!
I find it convenient to use the local analog of http://api.vedavaapi.org/jyotisha/jyotisha/docs#!/default/get_daily_calendar_handler for an almost-end-to-end test.
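The manual "diff the TeX file" check could be automated roughly as follows. This is only a sketch: `diff_against_expected` is a hypothetical helper, and how the actual and expected texts are produced (the TeX writer, or the daily calendar API above) is left to the caller.

```python
import difflib


def diff_against_expected(actual: str, expected: str) -> list:
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )
```

In a pytest test, `assert diff_against_expected(actual, expected) == []` would fail with the differing lines visible in the assertion report.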
@karthikraman - Please spot check https://github.com/sanskrit-coders/jyotisha/tree/master/jyotisha/panchangam/temporal/festival/data and see if all is well.
Have to check carefully, but currently, I am unable to even check it out, as I work on Windows. Folder names like `tata:naTarAjar An2i tirumaJcan2am` are not permissible :( -- perhaps we can have `ta` as a folder name and migrate the files to `ta\ta:naTarAjar An2i tirumaJcan2am`?
Fixed the filenames - using `__` instead of `:` - please recheck.
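For reference, the renaming could be sketched like this. Only the `:` to `__` substitution comes from this thread; applying the same replacement to the other characters Windows disallows in file names is an assumption of this sketch, and `safe_name` is a hypothetical helper. Reserved device names such as `CON`, and trailing dots or spaces, are not handled.

```python
# Characters that Windows does not allow in file or folder names.
WINDOWS_RESERVED = '<>:"/\\|?*'


def safe_name(name: str, replacement: str = "__") -> str:
    """Replace characters that cannot appear in a Windows file name."""
    return "".join(replacement if ch in WINDOWS_RESERVED else ch for ch in name)
```

With this, `safe_name("tata:naTarAjar An2i tirumaJcan2am")` gives `tata__naTarAjar An2i tirumaJcan2am`.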
Able to check out now. However, the `write_panchangam` scripts seem broken. Will look carefully.
`priority` can also be moved inside of `timing` in the json...
aparahna --> aparaahna?
> `priority` can also be moved inside of `timing` in the json...
What's the justification? it would be important to document it with the code which defines the expected json structure.
> aparahna --> aparaahna?
Sounds good!
> > `priority` can also be moved inside of `timing` in the json...
>
> What's the justification? it would be important to document it with the code which defines the expected json structure.
Sure. Priority is a part of timing -- if a particular tithi occurs on two days, priority of puurvaviddha says pick the first and paraviddha says pick the second...
> Sure. Priority is a part of timing -- if a particular tithi occurs on two days, priority of puurvaviddha says pick the first and paraviddha says pick the second...
Ah got it - then changing the field to pick_puurvaviddha and the value to a boolean is far clearer and "self-documenting".
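A small sketch of how such a boolean could be consumed, following the rule stated above (puurvaviddha picks the first of the two days, paraviddha the second). The function name `pick_day` and the representation of candidate days as `datetime.date` values are assumptions of this sketch.

```python
from datetime import date


def pick_day(candidate_days, pick_puurvaviddha: bool) -> date:
    """Choose a day when the tithi occurs on more than one day.

    pick_puurvaviddha=True  -> the earlier day (puurvaviddha)
    pick_puurvaviddha=False -> the later day (paraviddha)
    """
    days = sorted(candidate_days)
    if not days:
        raise ValueError("no candidate days given")
    return days[0] if pick_puurvaviddha else days[-1]
```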
Very much; perhaps some comment can carry the term `paraviddha` as well, like "pick puurvaviddha day rather than paraviddha", so that people are aware of the common classification...
Oh - then the field can be named "pick_paraviddha_vs_puurvaviddha".
I like Don Knuth's "literate programming" approach - code should read like a book. Comments can come in the following forms, in order of decreasing preference:
(Comments outside of code are not that useful.)
Aside: any access to a library with this book: http://www.worldcat.org/title/indian-calendric-system/oclc/40418421
> Aside: any access to a library with this book: http://www.worldcat.org/title/indian-calendric-system/oclc/40418421
Berkeley has it - will let you know if I can get hold of it.
We've requested the book and my wife may get it in a week or two (barring pregnancy-related delays). I was curious whether IITM doesn't have as good an interlibrary loan system for getting the book (I expect my wife to be working in a similar place after next year - hence the question).
To add to https://github.com/sanskrit-coders/jyotisha/issues/17#issuecomment-445467813 , I was just reading up and resummarizing my approach to good coding and stumbled on this 4-point summary of "Clean Code" - one major point of improvement for you would be to have much smaller function and file sizes.
We have an excellent library, but this book weirdly doesn't seem to be available anywhere in India! Maybe I can try placing a request with our librarian and he may have some way of getting it -- didn't try.
Thanks for the clean code tip. Will really work on it. I never envisaged this code becoming this big and useful :) --- have to work on my skills!
"Comments in clean code are almost never needed." 👍
Desiderata
Current status
Currently we're using giant files - especially https://github.com/sanskrit-coders/jyotisha/blob/master/jyotisha/panchangam/data/festival_rules.json .
Drawbacks
This violates [3], [4] and [2] to varying degrees.
Proposed improvement
Let's have this festival data structure -
@karthikraman - what do you think?