Open danmed opened 1 year ago
Thanks for the feedback! Indeed, I overlooked the importance of the file name. These are audiobooks for personal use, so I don't pay much attention to the file names as long as they are arranged in the correct order and include the chapter names. Besides, I don't want the file names to be too long.
It's feasible to add an option for users to enter the book title, but I'm not sure why the book title should be added to each chapter file. After all, the files will all be placed in a parent directory named after the book title.
Is the book you converted https://www.gutenberg.org/ebooks/4686? I tried it and found that, coincidentally, it doesn't have chapter titles at the beginning of each chapter. Its formatting isn't great.
So it outputs the filenames below. The tool can't find the chapter names, so it uses the first words of each chapter instead.
audiobook_output/pg4686_Underground_audiobook
├── 0001_The_Project_Gutenberg_eBook_of_Underground_Hacking_madness.mp3
├── 0002_Things_were_different_however_if_someone_already_had_a_rel.mp3
├── 0003_Zen_was_PIs_sophisticated_younger_sister_Within_two_years.mp3
├── 0004_Ive_just_found_a_bizarre_address_It_is_one_strange_system.mp3
├── 0005_While_these_thoughts_raced_through_Pars_mind_he_stood_rigi.mp3
├── 0006_Over_a_sixmonth_period_in_1992_Tony_Warren_received_more_t.mp3
├── 0007_It_was_maddening_The_frustration_was_unbearable_Each_time.mp3
├── 0008_This_was_the_first_major_computer_hacking_case_in_Australia.mp3
├── 0009_In_the_north_a_dark_cloud_gathered_over_Pad_and_Gandalf_as.mp3
├── 0010_Mendax_logged_into_the_NMELH1_system_by_using_the_account_Pr.mp3
├── 0011_At_his_police_interviews_he_learned_that_an_AFP_officer_had.mp3
├── 0012_Seven_That_was_helpful_Now_to_find_seven_digits_Anthrax.mp3
├── 0013_Its_a_conference_type_of_call_Yes_Here_is_another_do.mp3
├── 0014_SKiMo_is_mostly_a_loner_these_days_He_shares_a_limited_amou.mp3
└── 0015_END_OF_THE_PROJECT_GUTENBERG_EBOOK_UNDERGROUND_HACKING.mp3
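The fallback behind those names can be sketched roughly as follows. This is only an illustration using the Python standard library, not the tool's actual code: the names `_TitleFinder`, `sanitize`, and `chapter_filename`, and the 60-character limit, are all assumptions made for the example. It uses the first `<h1>` to `<h3>` heading as the chapter title when one exists and otherwise falls back to the first words of the body text.

```python
import re
from html.parser import HTMLParser


class _TitleFinder(HTMLParser):
    """Collects the text of the first <h1>-<h3> heading and the body text."""

    def __init__(self):
        super().__init__()
        self.heading_parts = []  # text found inside the first heading
        self.text_parts = []     # all other text found inside <body>
        self._in_heading = False
        self._heading_done = False
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self._in_body = True
        elif tag in ("h1", "h2", "h3") and not self._heading_done:
            self._in_heading = True
        elif tag == "br" and self._in_heading:
            # Treat a line break inside the heading as a word separator.
            self.heading_parts.append(" ")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3") and self._in_heading:
            self._in_heading = False
            self._heading_done = True

    def handle_data(self, data):
        if self._in_heading:
            self.heading_parts.append(data)
        elif self._in_body:
            self.text_parts.append(data)


def sanitize(text, max_len=60):
    """Drop punctuation, join words with underscores, and truncate."""
    text = re.sub(r"[^\w\s]", "", text)
    text = re.sub(r"\s+", "_", text.strip())
    return text[:max_len]


def chapter_filename(index, xhtml):
    """Build a name like 0003_<title>.mp3 for one chapter document."""
    finder = _TitleFinder()
    finder.feed(xhtml)
    finder.close()
    title = "".join(finder.heading_parts).strip()
    if not title:
        # No heading in the markup: fall back to the first words of the text.
        title = "".join(finder.text_parts)
    return f"{index:04d}_{sanitize(title)}.mp3"
```

With a chapter file that has no heading, the name starts with whatever text comes first, which is why the Underground files look the way they do.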
This is what the book's source file looks like; it does not contain any chapter info.
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.1//EN' 'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>The Project Gutenberg eBook of Underground: Hacking, madness and obsession on the electronic frontier, by Suelette Dreyfus</title><meta http-equiv="Content-Style-Type" content="text/css"/><meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8"/><link href="0.css" rel="stylesheet" type="text/css"/>
<link href="pgepub.css" rel="stylesheet" type="text/css"/>
<meta name="generator" content="Ebookmaker 0.12.30 by Project Gutenberg"/>
</head>
<body class="x-ebookmaker x-ebookmaker-2"><p id="id00857">While these thoughts raced through Par's mind, he stood rigid, his
feet glued to the cement floor, his face locked into the probing gaze
of the Secret Service agent. He felt like they were the only two
people who existed in the universe.</p>
<p id="id00858">Then, inexplicably, the agent looked away. He swivelled around to
finish his conversation with another agent. It was as if he had never
even seen the fugitive.</p>
<p id="id00859">Par stood, suspended and unbelieving. Somehow it seemed impossible. He
began to edge the rest of the way to his motel room. Slowly, casually,
he slid inside and shut the door behind him.</p>
And this is what my example epub looks like.
<?xml version='1.0' encoding='utf-8'?>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
<meta charset="utf-8"/><title>The Project Gutenberg eBook of The Life and Adventures of Robinson Crusoe, by Daniel Defoe</title>
<link href="0.css" rel="stylesheet" type="text/css"/>
<link href="1.css" rel="stylesheet" type="text/css"/>
<link href="pgepub.css" rel="stylesheet" type="text/css"/>
<meta name="generator" content="Ebookmaker 0.12.29 by Project Gutenberg"/>
</head>
<body class="x-ebookmaker x-ebookmaker-3"><div class="chapter" id="pgepubid00005">
<h2><a id="chap03"/>CHAPTER III.<br/>
WRECKED ON A DESERT ISLAND</h2>
<p>
After this stop, we made on to the southward continually for ten or twelve
days, living very sparingly on our provisions, which began to abate very much,
and going no oftener to the shore than we were obliged to for fresh water. My
design in this was to make the river Gambia or Senegal, that is to say anywhere
about the Cape de Verde, where I was in hopes to meet with some European ship;
and if I did not, I knew not what course I had to take, but to seek for the
islands, or perish there among the negroes. I knew that all the ships from
Europe, which sailed either to the coast of Guinea or to Brazil, or to the East
Indies, made this cape, or those islands; and, in a word, I put the whole of my
fortune upon this single point, either that I must meet with some ship or must
perish.
</p>
And the tool outputs filenames with chapter info, like below.
Yes, that's the one, and that explains why it looks like a bit of a mess to me :)
This is possibly just my OCD, but would it be possible to add some options to set the title of each file to something other than what the script decides?
Example:
Would it also be possible to customise the filenames in a similar fashion?