Closed funderburkjim closed 2 years ago
From the above, we see that 6-0345 is really 6-0329, and similarly for the scans that follow, so that 6-0407 is really 6-0391. The image pdf files are in the pdfpages directory of the 2013 version of PWG at Cologne (PWGScan/2013/web/pdfpages). Make a copy of the volume 6 images and put it in a temporary sibling directory temp_pdfpages_6:
mkdir temp_pdfpages_6
cp pdfpages/pwg6-*.pdf temp_pdfpages_6/
Now in effect rename the files by running this shell script:
cp temp_pdfpages_6/pwg6-0345.pdf pdfpages/pwg6-0329.pdf
cp temp_pdfpages_6/pwg6-0347.pdf pdfpages/pwg6-0331.pdf
cp temp_pdfpages_6/pwg6-0349.pdf pdfpages/pwg6-0333.pdf
cp temp_pdfpages_6/pwg6-0351.pdf pdfpages/pwg6-0335.pdf
cp temp_pdfpages_6/pwg6-0353.pdf pdfpages/pwg6-0337.pdf
cp temp_pdfpages_6/pwg6-0355.pdf pdfpages/pwg6-0339.pdf
cp temp_pdfpages_6/pwg6-0357.pdf pdfpages/pwg6-0341.pdf
cp temp_pdfpages_6/pwg6-0359.pdf pdfpages/pwg6-0343.pdf
cp temp_pdfpages_6/pwg6-0361.pdf pdfpages/pwg6-0345.pdf
cp temp_pdfpages_6/pwg6-0363.pdf pdfpages/pwg6-0347.pdf
cp temp_pdfpages_6/pwg6-0365.pdf pdfpages/pwg6-0349.pdf
cp temp_pdfpages_6/pwg6-0367.pdf pdfpages/pwg6-0351.pdf
cp temp_pdfpages_6/pwg6-0369.pdf pdfpages/pwg6-0353.pdf
cp temp_pdfpages_6/pwg6-0371.pdf pdfpages/pwg6-0355.pdf
cp temp_pdfpages_6/pwg6-0373.pdf pdfpages/pwg6-0357.pdf
cp temp_pdfpages_6/pwg6-0375.pdf pdfpages/pwg6-0359.pdf
cp temp_pdfpages_6/pwg6-0377.pdf pdfpages/pwg6-0361.pdf
cp temp_pdfpages_6/pwg6-0379.pdf pdfpages/pwg6-0363.pdf
cp temp_pdfpages_6/pwg6-0381.pdf pdfpages/pwg6-0365.pdf
cp temp_pdfpages_6/pwg6-0383.pdf pdfpages/pwg6-0367.pdf
cp temp_pdfpages_6/pwg6-0385.pdf pdfpages/pwg6-0369.pdf
cp temp_pdfpages_6/pwg6-0387.pdf pdfpages/pwg6-0371.pdf
cp temp_pdfpages_6/pwg6-0389.pdf pdfpages/pwg6-0373.pdf
cp temp_pdfpages_6/pwg6-0391.pdf pdfpages/pwg6-0375.pdf
cp temp_pdfpages_6/pwg6-0393.pdf pdfpages/pwg6-0377.pdf
cp temp_pdfpages_6/pwg6-0395.pdf pdfpages/pwg6-0379.pdf
cp temp_pdfpages_6/pwg6-0397.pdf pdfpages/pwg6-0381.pdf
cp temp_pdfpages_6/pwg6-0399.pdf pdfpages/pwg6-0383.pdf
cp temp_pdfpages_6/pwg6-0401.pdf pdfpages/pwg6-0385.pdf
cp temp_pdfpages_6/pwg6-0403.pdf pdfpages/pwg6-0387.pdf
cp temp_pdfpages_6/pwg6-0405.pdf pdfpages/pwg6-0389.pdf
cp temp_pdfpages_6/pwg6-0407.pdf pdfpages/pwg6-0391.pdf
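The 32 copy commands above follow a regular pattern: each odd-numbered scan from 0345 through 0407 is copied to the name 16 lower. As a rough sketch (not the script that was actually run), the same list can be generated with a loop. The function name `gen_shift_commands` is made up for illustration, and it only prints the commands rather than copying anything:

```shell
# Print the cp commands that shift each scan name down by a fixed offset.
# gen_shift_commands is a hypothetical helper; first, last and offset are
# read off the script above. Nothing is copied here, this is a dry run.
gen_shift_commands() {
  first=$1; last=$2; offset=$3
  n=$first
  while [ "$n" -le "$last" ]; do
    printf 'cp temp_pdfpages_6/pwg6-%04d.pdf pdfpages/pwg6-%04d.pdf\n' \
      "$n" "$((n - offset))"
    n=$((n + 2))   # scan files exist only for odd page numbers
  done
}

gen_shift_commands 345 407 16
```

Printing the commands first makes it possible to review the mapping before piping the output to `sh`.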
Now (after deleting browser history) we see that 6-0329 through 6-0391 are proper.
We are left with erroneous pwg6-0393.pdf through pwg6-0407.pdf. Next step is to get these images as pdfs.
If we look at the now revised scanned page 6-0327, we see rAjastamba as the first entry of p. 327 and rAji as the last entry of p. 329.
Now look at 6-0329. Here the first entry on p. 329 shows as rAmacandracampU.
But, according to pwg.txt the next entry after rAji is rAjika!
Wow! very confusing.
Will look for a volume 6 from archive.org.
https://archive.org/details/in.ernet.dli.2015.7348/page/n169/mode/2up
This version seems to align with our version.
BUT IT HAS internal page number errors in the printing.
I'm going to undo what the script above did.
Maybe if we look at the page contents, rather than the internal page numbering (which appears erroneous), things will make sense.
Ran script to restore original image file names:
cp temp_pdfpages_6/pwg6-0329.pdf pdfpages/pwg6-0329.pdf
cp temp_pdfpages_6/pwg6-0331.pdf pdfpages/pwg6-0331.pdf
cp temp_pdfpages_6/pwg6-0333.pdf pdfpages/pwg6-0333.pdf
cp temp_pdfpages_6/pwg6-0335.pdf pdfpages/pwg6-0335.pdf
cp temp_pdfpages_6/pwg6-0337.pdf pdfpages/pwg6-0337.pdf
cp temp_pdfpages_6/pwg6-0339.pdf pdfpages/pwg6-0339.pdf
cp temp_pdfpages_6/pwg6-0341.pdf pdfpages/pwg6-0341.pdf
cp temp_pdfpages_6/pwg6-0343.pdf pdfpages/pwg6-0343.pdf
cp temp_pdfpages_6/pwg6-0345.pdf pdfpages/pwg6-0345.pdf
cp temp_pdfpages_6/pwg6-0347.pdf pdfpages/pwg6-0347.pdf
cp temp_pdfpages_6/pwg6-0349.pdf pdfpages/pwg6-0349.pdf
cp temp_pdfpages_6/pwg6-0351.pdf pdfpages/pwg6-0351.pdf
cp temp_pdfpages_6/pwg6-0353.pdf pdfpages/pwg6-0353.pdf
cp temp_pdfpages_6/pwg6-0355.pdf pdfpages/pwg6-0355.pdf
cp temp_pdfpages_6/pwg6-0357.pdf pdfpages/pwg6-0357.pdf
cp temp_pdfpages_6/pwg6-0359.pdf pdfpages/pwg6-0359.pdf
cp temp_pdfpages_6/pwg6-0361.pdf pdfpages/pwg6-0361.pdf
cp temp_pdfpages_6/pwg6-0363.pdf pdfpages/pwg6-0363.pdf
cp temp_pdfpages_6/pwg6-0365.pdf pdfpages/pwg6-0365.pdf
cp temp_pdfpages_6/pwg6-0367.pdf pdfpages/pwg6-0367.pdf
cp temp_pdfpages_6/pwg6-0369.pdf pdfpages/pwg6-0369.pdf
cp temp_pdfpages_6/pwg6-0371.pdf pdfpages/pwg6-0371.pdf
cp temp_pdfpages_6/pwg6-0373.pdf pdfpages/pwg6-0373.pdf
cp temp_pdfpages_6/pwg6-0375.pdf pdfpages/pwg6-0375.pdf
cp temp_pdfpages_6/pwg6-0377.pdf pdfpages/pwg6-0377.pdf
cp temp_pdfpages_6/pwg6-0379.pdf pdfpages/pwg6-0379.pdf
cp temp_pdfpages_6/pwg6-0381.pdf pdfpages/pwg6-0381.pdf
cp temp_pdfpages_6/pwg6-0383.pdf pdfpages/pwg6-0383.pdf
cp temp_pdfpages_6/pwg6-0385.pdf pdfpages/pwg6-0385.pdf
cp temp_pdfpages_6/pwg6-0387.pdf pdfpages/pwg6-0387.pdf
cp temp_pdfpages_6/pwg6-0389.pdf pdfpages/pwg6-0389.pdf
cp temp_pdfpages_6/pwg6-0391.pdf pdfpages/pwg6-0391.pdf
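One way to confirm that the restore left pdfpages identical to the backup is to compare the files byte for byte. This is only a sketch under the assumption that the backup in temp_pdfpages_6 is still untouched; `verify_restore` is a hypothetical helper name, not part of any existing tooling:

```shell
# Report every backup file whose counterpart in the live directory is
# missing or differs. verify_restore is a hypothetical helper name.
verify_restore() {
  backup_dir=$1; live_dir=$2; rc=0
  for f in "$backup_dir"/pwg6-*.pdf; do
    [ -e "$f" ] || continue            # glob matched nothing
    name=$(basename "$f")
    # cmp -s is silent and returns non-zero on any difference
    if ! cmp -s "$f" "$live_dir/$name" 2>/dev/null; then
      echo "MISMATCH: $name"
      rc=1
    fi
  done
  return $rc
}
```

Running `verify_restore temp_pdfpages_6 pdfpages` from the parent directory should print nothing and return 0 if every file was restored.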
Now the page numbering is still wrong in the above range. This is a problem inherent in the printed edition. If
So this whole exercise ended up changing nothing. We'll call it a learning experience :).
As an experiment, I used an old copy of Adobe Acrobat to add a text field with the corrected page number to the pdf for 6-0329: https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/servepdf.php?dict=pwg&page=6-0329
Do others think this is a good idea? If so, I can add a similar field to the rest of the pages. It's somewhat crude, but might protect some future users from the same confusion I had.
What do you think?
@funderburkjim,
How about using this? PWG Vol.6 Sp. 301-410.pdf
You may also look at this: https://github.com/sanskrit-lexicon/PWG/issues/16#issuecomment-846426579
At first glance, your vol. 6 pdf looks quite good -- must be a different printing that corrected the page numbering problem.
Will examine further.
This is from the MLBD reprint of the Japanese edition, which you also happened to see at the archive (the combined book of vols. 1-7).
It is better throughout; the Koeln scans are bad at quite a few places.
Have examined all the pages 6-0329 through 6-0407.
Compared the MLBD print (per pdf above) to the current Cologne scans.
For each two-column 'page', looked at first line of column 1 and last line of column 2.
In all cases, they appeared identical.
But the MLBD page numbering is correct.
Noticed generally better page alignment in MLBD.
General print quality appears similar.
Will unpack AB's pdf and insert into appropriate spot on Cologne server.
The new pages are now on Cologne server.
This completes solution of main problem of this issue. Thanks to @Andhrabharati for providing the new images.
The sanskrit-lexicon-scans Github 'organization' has repositories for the scans of all the dictionaries.
Although our software does not currently use these images on Github, we should try to keep this source of images in sync with the 'official' source of images that are on the Cologne server.
This image shows that the pwg images at Github need also to be corrected:
First, clone https://github.com/sanskrit-lexicon-scans/pwg to the local machine:
git clone https://github.com/sanskrit-lexicon-scans/pwg.git
Cloning into 'pwg'...
remote: Enumerating objects: 4745, done.
remote: Total 4745 (delta 0), reused 0 (delta 0), pack-reused 4745
Receiving objects: 100% (4745/4745), 1.40 GiB | 5.02 MiB/s, done.
Resolving deltas: 100% (1/1), done.
Updating files: 100% (4740/4740), done.
Second, copy the 'rename' pages into the pdfpages folder of the cloned local sanskrit-lexicon-scans/pwg.
Third, git add, git commit, and git push.
After removing browser history, same link as above now has the revised page
While improving the RV markup in PWG (#38), I noticed some problems with the scanned image links in Volume 6. @sanskritisampada has checked all the scan links (from 6-0301 through 6-0499). The task was to repeatedly use the servepdf url, starting with https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/servepdf.php?dict=PWG&page=6-0301 https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/servepdf.php?dict=PWG&page=6-0302 and continuing through https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/servepdf.php?dict=PWG&page=6-0499
and to note where the internal page numbers differ from the page number of the url (There are 2 pages in each scan image for PWG). Here are the results
Now I need to get straight what is just a labeling error (problem with pdffiles.txt for PWG) and what scanned images (if any) are missing.
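For anyone repeating the check, the list of servepdf URLs from 6-0301 through 6-0499 can be generated rather than typed by hand. This is a minimal sketch: the base URL and the `dict=PWG&page=6-NNNN` form are copied from the links above, while `servepdf_urls` is a made-up helper name:

```shell
# Print one servepdf URL per page label from first to last (inclusive).
# servepdf_urls is a hypothetical helper; the URL pattern is taken from
# the links quoted in this issue.
servepdf_urls() {
  first=$1; last=$2
  base='https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/servepdf.php'
  n=$first
  while [ "$n" -le "$last" ]; do
    printf '%s?dict=PWG&page=6-%04d\n' "$base" "$n"
    n=$((n + 1))
  done
}

servepdf_urls 301 499
```

The output can be saved to a file and worked through one link at a time, noting wherever the printed page number differs from the number in the URL.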