sanskrit-lexicon / SKD

Discussion of corrections and other issues pertaining to Sabdakalpadruma dictionary at Sanskrit-Lexicon

SKD replace scans #16

Closed funderburkjim closed 1 year ago

funderburkjim commented 1 year ago

This note provides instructions for replacing the current skd scans with those provided by @maltenth at #14.

The basic idea is to create a new version of the repository https://github.com/sanskrit-lexicon-scans/skd. Then @funderburkjim can pull this repository to Cologne and move the pdfpages folder to the spot expected by the servepdf application.

page names

Each page should be converted to a pdf. The file name of each pdf should be consistent with https://github.com/sanskrit-lexicon/csl-websanlexicon/blob/master/v02/distinctfiles/skd/web/webtc/pdffiles.txt, since the servepdf application uses this file to determine which pdf to serve for a given page. For instance, page 2-010 (page 10 of the 2nd volume of skd) is retrieved by the url https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/servepdf.php?dict=skd&page=2-010

In pdffiles.txt 2-010 is associated with file name pg2_010.pdf at the line 2-010:pg2_010.pdf:kawItala.
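
The naming convention described above can be sketched in Python. This is only an illustration: the function names are hypothetical, the three-field colon-separated layout is inferred from the single example line quoted above, and the `pgV_NNN.pdf` pattern is an assumption that should be checked against pdffiles.txt itself.

```python
# Base URL copied from the example in the comment above.
SERVEPDF_URL = "https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/servepdf.php"


def parse_pdffiles_line(line):
    """Split one 'page:filename:headword' line into its three fields."""
    page, filename, headword = line.strip().split(":", 2)
    return page, filename, headword


def expected_filename(page):
    """Guess the pdf name for a page id like '2-010' (assumed pattern pgV_NNN.pdf)."""
    volume, number = page.split("-", 1)
    return f"pg{volume}_{number}.pdf"


def servepdf_url(page, dictionary="skd"):
    """Build the servepdf URL for a page id, following the example URL above."""
    return f"{SERVEPDF_URL}?dict={dictionary}&page={page}"


page, filename, headword = parse_pdffiles_line("2-010:pg2_010.pdf:kawItala")
print(page, filename, headword)
print(expected_filename(page) == filename)
print(servepdf_url(page))
```

Comparing `expected_filename(page)` with the file name actually listed on each line is a quick way to spot any entries in pdffiles.txt that do not follow the assumed pattern.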

@Andhrabharati Once you have a local folder with all the pdf pages appropriately named, then let us consider in more detail the logistics involved with the sanskrit-lexicon-scans repository.

Andhrabharati commented 1 year ago

@funderburkjim Here are the first pages of the 5 volumes-

pg1_001.pdf pg2_001.pdf pg3_001.pdf pg4_001.pdf pg5_001.pdf

and the last pages of the resp. volumes-

pg1_315.pdf pg2_937.pdf pg3_792.pdf pg4_565.pdf pg5_555.pdf
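
Given the first and last page names listed above, a local folder could be checked for completeness with something like the following Python sketch. The helper names are hypothetical, and the code assumes that page numbers run from 001 to the last page of each volume without gaps; pdffiles.txt remains the authoritative list, so its file names can be passed to `missing_from` instead.

```python
from pathlib import Path

# Last page of each volume, as listed in the comment above.
LAST_PAGE = {1: 315, 2: 937, 3: 792, 4: 565, 5: 555}


def expected_names(last_page=LAST_PAGE):
    """Names pgV_NNN.pdf for NNN = 001..last, assuming no gaps in the numbering."""
    return [
        f"pg{vol}_{n:03d}.pdf"
        for vol, last in sorted(last_page.items())
        for n in range(1, last + 1)
    ]


def missing_from(folder, names):
    """Return the expected names not found anywhere under folder (subfolders included)."""
    present = {p.name for p in Path(folder).rglob("*.pdf")}
    return [name for name in names if name not in present]
```

Because `missing_from` searches subfolders recursively, it works the same way whether the pdfs sit in one flat folder or in separate per-volume folders.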

As I had suggested earlier, I first tried reducing the file size 25-fold, but the quality was not satisfactory; hence I resorted to a 20-fold reduction instead. [The overall size is about 1.2 GB for the dictionary pages; the intro pages are skipped in this lot!!]

What's the next step?

funderburkjim commented 1 year ago

I have

  1. renamed sanskrit-lexicon-scans/skd to skd-v0
  2. created a new repository skd
  3. invited you Andhrabharati as a collaborator.

You should get an email from GitHub regarding the invitation. Please accept it.

When I get notification of your acceptance, we can proceed.

Note: I don't do github management stuff very often. So there may be a couple of false starts.

Can I assume you have command-line access to GitHub (such as via Git Bash for Windows)?

If you don't have command-line access, we can go some other route.

funderburkjim commented 1 year ago

@Andhrabharati adding this comment so you'll be sure to see the previous comment.

Andhrabharati commented 1 year ago

I never tried command line access, so not sure about it.

I am using Github Desktop app, and have access to the repo files through it.

funderburkjim commented 1 year ago

I don't use Github Desktop, so let's go to plan B.

Put your images in a folder (or zip) somewhere in the cloud. And provide a url that I can use to download.

Andhrabharati commented 1 year ago

I am sending them to funderberkjim/skd repo.

Thought you wanted them there.

funderburkjim commented 1 year ago

That should be ok. I'll check back in a couple of hours and look at funderburkjim/skd (note spelling: ..burk..)

Andhrabharati commented 1 year ago

Sorry for my wrong spelling.

Please note that I have stored the scan pages in separate folders, volume-wise (1 to 5), for my convenience, and am uploading them as such.

funderburkjim commented 1 year ago

I see 5 volumes. Ready for me to work with?

drdhaval2785 commented 1 year ago

@funderburkjim

Just a note on Github Desktop. I have worked with both the CLI and the Desktop app. The Desktop app has buttons for add, commit and push, so whatever files one selects are processed when these buttons are clicked. The outcome is identical.

Andhrabharati commented 1 year ago

Yes pl. @funderburkjim

funderburkjim commented 1 year ago

Am close to end of the MW accent review. It may be several days before I finish skd scan install.

Andhrabharati commented 1 year ago

@funderburkjim

I wanted to say that you should cover the annexure pages of MW as well for accents.

You had mentioned in the beginning that you are leaving those pages, as I had 'seen' them in my working last year (1st quarter).

I would like to remind you again that I was to 'work' on the comp.word headers and groups (specifically), but it did not happen as you wanted to do something before I start it. And for some reason, you did not 'inform' if you had done your intended work, for me to finish my part. So, I had covered many other areas/repos at CDSL from that time.

This definitely is the wrong place to say this, but the occasion dictates posting this message here.

gasyoun commented 1 year ago

This definitely is a wrong place to say this, but the occasion dictates to post this message here.

Agree.

I would like to remind you again that I was to 'work' on the comp.word headers and groups (specifically), but it did not happen as you wanted to do something before I start it.

Of utmost interest.

funderburkjim commented 1 year ago

New scans replaced at Cologne.

12-28-2022
Replace pdfs, i.e., a new version of SKDScanpdf directory

The old pdfs are in a certain location on the Cologne server, e.g.,
 https://www.sanskrit-lexicon.uni-koeln.de/scans/SKDScan/SKDScanpdf/pg2_340.pdf

User (Andhrabharati) uploaded a new version of the pdfs to a temporary
repository, under the same names (e.g. pg2_340.pdf).
The objective is to replace the old pdfs with the new pdfs on the Cologne
server.

1. Download new images into some temporary directory at Cologne (tempnewscans)
2. mv SKDScanpdf SKDScanpdf-v0  # move old images to new location
3. mv tempnewscans SKDScanpdf   #

Now, 
https://www.sanskrit-lexicon.uni-koeln.de/scans/SKDScan/SKDScanpdf/pg2_340.pdf will display the new image pdf.
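
The two mv steps above could also be scripted. The following Python sketch is only an illustration of the same swap, not the procedure actually run at Cologne: the function name and its safety checks are hypothetical, and it assumes all three directories sit under one parent directory on the same filesystem.

```python
import os


def swap_scan_dirs(parent, live="SKDScanpdf", backup="SKDScanpdf-v0",
                   incoming="tempnewscans"):
    """Move live -> backup, then incoming -> live, all inside parent."""
    live_path = os.path.join(parent, live)
    backup_path = os.path.join(parent, backup)
    incoming_path = os.path.join(parent, incoming)

    # Refuse to proceed if the swap would overwrite an earlier backup
    # or if the new scans have not been downloaded yet.
    if os.path.exists(backup_path):
        raise FileExistsError(f"backup already exists: {backup_path}")
    if not os.path.isdir(incoming_path):
        raise FileNotFoundError(f"new scans not found: {incoming_path}")

    os.rename(live_path, backup_path)    # step 2: old images to new location
    os.rename(incoming_path, live_path)  # step 3: new images take the old name
```

Checking for an existing backup first means a second run cannot silently replace the old scans that were set aside for comparison.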

File size comparison: the new scans are roughly twice as large in bytes (about 2.45 times for the sample page below).
> ls  -all SKDScanpdf/pg2_340.pdf
-rw-r--r-- 1 jfunderb uniuser 386518 29. Dez 01:12 SKDScanpdf/pg2_340.pdf
> ls  -all SKDScanpdf-v0/pg2_340.pdf
-rw------- 1 wwwadm1 uniuser 157614  3. Okt 2013  SKDScanpdf-v0/pg2_340.pdf

Temporarily, for comparison purposes, an old page can still be accessed, e.g.,
https://www.sanskrit-lexicon.uni-koeln.de/scans/SKDScan/SKDScanpdf-v0/pg2_340.pdf

SKDScanpdf-v0 will eventually be removed from the Cologne server.

Hope @drdhaval2785 and other fans of SKD will find the new images somewhat better.