KaniyamFoundation / ProjectIdeas

A Place to write down the project ideas and to plan them

Shodhganga Tamil Theses #220

Open IngersolNorway opened 2 months ago

IngersolNorway commented 2 months ago

Can anyone help us extract Tamil theses from Shodhganga? There are over 5,000 Tamil PhD theses, each containing around 10 PDFs. We need each thesis saved as a separate ZIP archive containing its PDFs and the Shodhganga index page as a TXT file.

https://shodhganga.inflibnet.ac.in/handle/10603/56100

https://shodhganga.inflibnet.ac.in/handle/10603/33193

Please help.

Regards, Ingersol
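
For anyone picking this up, the packaging step in the request above could be sketched roughly as follows. This is a minimal sketch that assumes the PDFs and the index page text have already been downloaded. The function name `build_thesis_zip` and the entry name `index.txt` are hypothetical and not part of any existing script in this thread.

```python
import io
import zipfile


def build_thesis_zip(index_text, pdfs):
    """Return the bytes of one ZIP archive for a single thesis.

    index_text: the Shodhganga index page, already converted to plain text.
    pdfs: mapping of PDF file name -> PDF content as bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The index page goes in first, under a hypothetical fixed name.
        archive.writestr("index.txt", index_text)
        # PDFs are added in file-name order so chapters stay in sequence.
        for name in sorted(pdfs):
            archive.writestr(name, pdfs[name])
    return buffer.getvalue()
```

The returned bytes can be written to a file named after the thesis handle, so that each thesis ends up in its own archive.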

gurulenin commented 2 months ago

I can help with this. I downloaded and merged the Tamil theses from Alagappa University: about 170 theses, 13 GB in total. Each thesis is a single PDF file.


gurulenin commented 2 months ago

alagappa-university-tamil-theses-downloader.txt

gurulenin commented 2 months ago

This is the bash script to download all Tamil theses of Alagappa University. Just change the extension, mark it as executable, and run it from the command line. You will also need the pdftk tool installed.

alagappa-university-tamil-theses-downloader.txt
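
The contents of the attached script are not shown in this thread, so the following is only a sketch of how a per-thesis merge with pdftk might look, not the script itself. It assumes the chapter PDFs sort correctly by file name (for example with numeric prefixes); the function names are hypothetical. The `cat` and `output` keywords are pdftk's own merge syntax.

```python
import subprocess


def pdftk_merge_command(chapter_pdfs, merged_pdf):
    """Build the pdftk argument list that concatenates PDFs into one file."""
    if not chapter_pdfs:
        raise ValueError("no PDFs to merge")
    # Assumption: sorting by file name gives the right chapter order.
    return ["pdftk", *sorted(chapter_pdfs), "cat", "output", merged_pdf]


def merge_thesis(chapter_pdfs, merged_pdf):
    """Run pdftk; raises CalledProcessError if pdftk exits with an error."""
    subprocess.run(pdftk_merge_command(chapter_pdfs, merged_pdf), check=True)
```

Building the command separately from running it makes it easy to print and check the argument list before starting a long batch.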

gurulenin commented 2 months ago

Here are the PDF links. Use any download manager to download them all.

alu-tamil-theses-pdf-links.txt

IngersolNorway commented 2 months ago

alu-tamil-theses-pdf-links.txt

FYI... It has only 170 titles

IngersolNorway commented 2 months ago

Could you create a text file in the same manner, including the title and PDF list, for these 1120 theses?

https://shodhganga.inflibnet.ac.in/handle/10603/33193
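
One possible way to produce such a listing from a flat list of PDF links is sketched below. It assumes the links follow the usual DSpace bitstream shape, `.../bitstream/<prefix>/<id>/<sequence>/<file>.pdf`, which has not been checked against the attached file. Titles are not contained in the links, so they are passed in separately; all function names here are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_links_by_handle(links):
    """Group PDF links by thesis handle, e.g. '10603/<id>'."""
    groups = defaultdict(list)
    for link in links:
        link = link.strip()
        if not link:
            continue
        parts = urlparse(link).path.strip("/").split("/")
        if "bitstream" not in parts:
            continue
        i = parts.index("bitstream")
        # Assumed shape: bitstream/<prefix>/<id>/<sequence>/<file name>
        if len(parts) >= i + 5:
            groups[f"{parts[i + 1]}/{parts[i + 2]}"].append(link)
    return dict(groups)


def format_listing(groups, titles=None):
    """Render one block per thesis: handle and title, then its PDF links."""
    titles = titles or {}
    lines = []
    for handle in sorted(groups):
        lines.append(f"{handle}\t{titles.get(handle, '(title not fetched)')}")
        lines.extend(f"  {url}" for url in groups[handle])
    return "\n".join(lines)
```

The titles would have to come from each thesis index page; until they are fetched, the listing falls back to a placeholder.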

IngersolNorway commented 2 months ago

Hi Guru Lenin,

Can I have your contact number?

Regards,

Ingersol Selvaraj

M +47 46 24 90 46


IngersolNorway commented 2 months ago

Can you share these 13 GB of PDF files with me? ingersol.Selvaraj@gmail.com

gurulenin commented 2 months ago

Hi,

Ping me on Telegram:

@gurulenin
