Open andy5995 opened 6 years ago
Do you still have the problem? It may have happened because there was no folder called tag when the script was called. I fixed that in the new script.
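After I moved the files in _post into newly created subdirectories, the tags could no longer be generated. Could you please update the script? Thanks!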
Hi, I am using Jekyll to build my site and I have a similar problem. In fact, after I ran the script, it ended up deleting my existing manually created tagname.md pages. I think the issue may be that it only looks through
post_dir = '_posts/'
However, I want it to generate tags from other pages in other places. I don't know if I could do that with site.pages or something; I do not know much about Python. There may also be a different issue: when I tried moving some of the posts/pages from which I want to generate tags into the _posts/ directory, I got this error:
File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 1882: ordinal not in range(128)
I don't think I have anything particularly weird in my YAML, only numbers, letters, :, ], /, -,
or does it go through the entire post? I'm not sure how to deal with this, or if my analysis is completely wrong. Any advice would be appreciated!
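On the question of whether the whole post is read: the traceback comes from decoding the file, and Python decodes a text file in buffered chunks, so a non-ASCII byte in the post body can raise the error even if the YAML header is plain ASCII. A minimal sketch of one way around it, assuming the posts are saved as UTF-8; `read_post_lines` is a hypothetical helper name, not part of the original script:

```python
def read_post_lines(path):
    """Return the lines of a post, decoding the file explicitly as UTF-8.

    Assumption: the post is saved as UTF-8. Passing the encoding explicitly
    avoids falling back to the platform default, which is ASCII in the
    traceback above.
    """
    with open(path, 'r', encoding='utf-8') as f:
        return [line.rstrip('\n') for line in f]

# Usage (the path is only an example):
# lines = read_post_lines('_posts/example.md')
```

If a file is not valid UTF-8, this still raises UnicodeDecodeError, which at least points at the file that needs re-saving.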
For me, this was happening because my _pages/ directory organizes blog posts into subfolders based on the primary category they fall into:
The script as written only works if all of your blog posts sit directly in the _pages directory. To also traverse all nested subdirectories and process those blog posts, use this:
for dir_name, subdir_list, file_list in os.walk(post_dir):
    for file in file_list:
        f = open(os.path.join(dir_name, file), 'r', encoding='utf-8')
        crawl = False
        # rest of the script
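The same traversal can be pulled into a small function that only collects paths, which makes it easier to check which files the script will actually see. This is a sketch under assumptions: `find_posts` and the extension filter are hypothetical additions, not part of the original script.

```python
import os

def find_posts(post_dir, extensions=('.md', '.markdown')):
    """Return a sorted list of post paths under post_dir, including subfolders.

    The extensions tuple is an assumption; adjust it to the extensions
    your posts actually use.
    """
    paths = []
    for dir_name, subdir_list, file_list in os.walk(post_dir):
        for file in file_list:
            if file.endswith(extensions):
                paths.append(os.path.join(dir_name, file))
    return sorted(paths)

# Usage (directory name taken from the comment above):
# for path in find_posts('_pages/'):
#     print(path)
```

Printing the result before parsing anything is a quick way to confirm that nested posts are being picked up.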
To add to @AleksandrHovhannisyan's comment, here is code that supports subdirectories as well as a list of tags specified with tags (so you can write things like tags: [one, two, 'first tag', 'second tag']):
import glob
import os

post_dir = '_posts/'
tag_dir = 'tag/'

# Collect Markdown posts from post_dir and all of its subdirectories.
file_names = glob.glob(post_dir + '**/*.md', recursive=True)

tags = set()
for file in file_names:
    # Read as UTF-8 so non-ASCII characters do not raise UnicodeDecodeError.
    f = open(file, 'r', encoding='utf-8')
    inside_header = False
    for line in f:
        line = line.strip()
        if line == '---':
            if inside_header:
                break  # continue to the next file
            inside_header = True
        if line.startswith('tags:'):
            tags_token = line[5:].strip()
            if tags_token.startswith('['):
                # Inline list form: tags: [one, two, 'first tag']
                tags_token = tags_token.strip('[]')
                new_tags = [l.strip().strip(" " + "'" + '"')
                            for l in tags_token.split(',')]
            else:
                # Space-separated form: tags: one two
                new_tags = tags_token.split()
            tags.update(new_tags)
    f.close()

# Remove previously generated tag pages before writing new ones.
old_tags = glob.glob(tag_dir + '*.md')
for tag in old_tags:
    os.remove(tag)

if not os.path.exists(tag_dir):
    os.makedirs(tag_dir)

for tag in tags:
    tag_filename = tag_dir + tag + '.md'
    f = open(tag_filename, 'a', encoding='utf-8')
    write_str = '---\nlayout: tagpage\ntitle: \"Tag: ' + tag + '\"\ntag: ' + tag + '\nrobots: noindex\n---\n'
    f.write(write_str)
    f.close()
print("Tags generated ({count}): {tags}".format(count=len(tags),
                                                tags=', '.join(tags)))
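The tag-line handling in the script above can be isolated into a function so it can be tried on single lines without touching any files. This is a sketch that follows the same logic; `parse_tags_line` is a hypothetical name, and unlike the script it skips empty items (for example from `tags: []`).

```python
def parse_tags_line(line):
    """Parse one front-matter line such as "tags: [one, 'first tag']".

    Returns a list of tag strings, or an empty list if the line is not a
    tags line or carries no tags.
    """
    line = line.strip()
    if not line.startswith('tags:'):
        return []
    token = line[len('tags:'):].strip()
    if token.startswith('['):
        # Inline list form: split on commas, then drop whitespace and quotes.
        items = token.strip('[]').split(',')
        cleaned = [item.strip().strip('\'"').strip() for item in items]
        return [tag for tag in cleaned if tag]
    # Space-separated form: tags: one two
    return token.split()

# Usage:
# parse_tags_line("tags: [one, two, 'first tag', 'second tag']")
```

Note that a tag containing a comma, or a multi-word tag in the space-separated form, is still split apart; handling that would need a real YAML parser.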
Tags generated, count 0
It seems to be breaking before reading the tags.
After adding a little debug code
It never gets to here:
Maybe the format of my post file?
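One way to check whether the post format is the cause is to classify a post's front matter before running the generator. The sketch below is only a diagnostic under assumptions: `diagnose_front_matter` and its return strings are hypothetical. It flags one case the script above does not read, namely tags written as a YAML block list (`tags:` on its own line followed by `- item` lines).

```python
def diagnose_front_matter(text):
    """Return a short label describing how tags appear in a post's text."""
    lines = [line.strip() for line in text.splitlines()]
    if not lines or lines[0] != '---':
        return 'no-front-matter'
    try:
        end = lines.index('---', 1)  # position of the closing delimiter
    except ValueError:
        return 'unclosed-front-matter'
    header = lines[1:end]
    for i, line in enumerate(header):
        if line.startswith('tags:'):
            if line[len('tags:'):].strip():
                return 'inline-tags'
            if i + 1 < len(header) and header[i + 1].startswith('- '):
                return 'block-list-tags'  # not read by the script above
            return 'empty-tags'
    return 'no-tags-key'

# Usage (the path is only an example):
# with open('_posts/example.md', encoding='utf-8') as f:
#     print(diagnose_front_matter(f.read()))
```

It may also be worth checking the file extension, since the glob pattern in the script only matches files ending in .md.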