Rock-Candy-Tea / hexo-circle-of-friends

Python gets the friend's articles from hexo's friend-links
Apache License 2.0
279 stars 527 forks

Friend-link avatars and authors are crawled incorrectly #136

Open HeLongaa opened 6 months ago

HeLongaa commented 6 months ago

Theme: halo-theme-hao. Friend-links page markup (excerpt):

<h2>
                    <a class="headerlink" href="#友情链接-24"
                       title="友情链接(24)"></a>
                    友情链接 (24)
                </h2>

                <div class="flink-desc">每个站点都值得一看</div>

                <!-- First group: displayed as cards -->

                <div class="flink-list">
                    <div class="flink-list-item">
                        <span style="background-color:#425AEF"
                              class="site-card-tag">荐</span>
                        <a class="cf-friends-link" rel="external nofollow" target="_blank" href="https://dusays.com"
                           title="杜老师说">
                            <img class="flink-avatar cf-friends-avatar" alt="杜老师说"

                                 src="/upload/lyszm17.gif"
                                 data-lazy-src="https://resources.blog.duolaa.asia/img/202402150119899.webp">
                            <div class="flink-item-info no-lightbox">
                                <span class="flink-item-name cf-friends-name">杜老师说</span>
                                <span class="flink-item-desc" title="师者,传道,授业,解惑!">师者,传道,授业,解惑!</span>
                                <img
                                     src="/upload/lyszm17.gif"
                                     data-lazy-src="https://resources.blog.duolaa.asia/img/202402150119899.webp">
                            </div>
                        </a>
                    </div>
                    <div class="flink-list-item">
                        <span style="background-color:#425AEF"
                              class="site-card-tag">荐</span>
                        <a class="cf-friends-link" rel="external nofollow" target="_blank" href="https://blog.zhheo.com/"
                           title="张洪Heo">
                            <img class="flink-avatar cf-friends-avatar" alt="张洪Heo"

                                 src="/upload/lyszm17.gif"
                                 data-lazy-src="https://bu.dusays.com/2022/12/28/63ac2812183aa.png">
                            <div class="flink-item-info no-lightbox">
                                <span class="flink-item-name cf-friends-name">张洪Heo</span>
                                <span class="flink-item-desc" title="分享设计与科技生活">分享设计与科技生活</span>
                                <img
                                     src="/upload/lyszm17.gif"
                                     data-lazy-src="https://bu.dusays.com/2022/12/28/63ac2812183aa.png">
                            </div>
                        </a>
                    </div>

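When crawling in butterfly mode, Actions reports that the numbers of links, avatars, and names do not match. Checking the database shows that two adjacent images are recognized as the same avatar, and that avatar is wrong as well. Please consider adapting to this theme or providing a workaround.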
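
A possible cause, judging only from the markup above: each card contains two `<img>` tags, so a rule that collects every image would return twice as many avatars as links or names. The sketch below is not the project's crawler code. It is a minimal standalone illustration, using Python's standard `html.parser`, of reading one record per `cf-friends-link` anchor and taking only the image that carries the `cf-friends-avatar` class. The names `FriendCardParser`, `parse_friends`, and `SAMPLE` are hypothetical, and preferring `data-lazy-src` over `src` is an assumption based on `src` holding the same placeholder GIF on every card.

```python
from html.parser import HTMLParser

# Two cards trimmed from the markup quoted in this issue.
SAMPLE = """
<div class="flink-list">
  <div class="flink-list-item">
    <a class="cf-friends-link" href="https://dusays.com" title="杜老师说">
      <img class="flink-avatar cf-friends-avatar" alt="杜老师说"
           src="/upload/lyszm17.gif"
           data-lazy-src="https://resources.blog.duolaa.asia/img/202402150119899.webp">
      <div class="flink-item-info no-lightbox">
        <span class="flink-item-name cf-friends-name">杜老师说</span>
        <img src="/upload/lyszm17.gif"
             data-lazy-src="https://resources.blog.duolaa.asia/img/202402150119899.webp">
      </div>
    </a>
  </div>
  <div class="flink-list-item">
    <a class="cf-friends-link" href="https://blog.zhheo.com/" title="张洪Heo">
      <img class="flink-avatar cf-friends-avatar" alt="张洪Heo"
           src="/upload/lyszm17.gif"
           data-lazy-src="https://bu.dusays.com/2022/12/28/63ac2812183aa.png">
      <div class="flink-item-info no-lightbox">
        <span class="flink-item-name cf-friends-name">张洪Heo</span>
        <img src="/upload/lyszm17.gif"
             data-lazy-src="https://bu.dusays.com/2022/12/28/63ac2812183aa.png">
      </div>
    </a>
  </div>
</div>
"""


class FriendCardParser(HTMLParser):
    """Build one {name, link, avatar} record per cf-friends-link anchor."""

    def __init__(self):
        super().__init__()
        self.friends = []      # finished records
        self._current = None   # record being built while inside the anchor
        self._in_name = False  # True while inside the cf-friends-name span

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "cf-friends-link" in classes:
            self._current = {"name": "", "link": attrs.get("href", ""), "avatar": ""}
            return
        if self._current is None:
            return
        if tag == "img" and "cf-friends-avatar" in classes:
            # Assumption: data-lazy-src is the real avatar, src a placeholder.
            self._current["avatar"] = attrs.get("data-lazy-src") or attrs.get("src", "")
        elif tag == "span" and "cf-friends-name" in classes:
            self._in_name = True

    def handle_data(self, data):
        if self._in_name and self._current is not None:
            self._current["name"] += data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_name = False
        elif tag == "a" and self._current is not None:
            self.friends.append(self._current)
            self._current = None


def parse_friends(html):
    """Return the list of friend records found in the given HTML string."""
    parser = FriendCardParser()
    parser.feed(html)
    return parser.friends


if __name__ == "__main__":
    for friend in parse_friends(SAMPLE):
        print(friend)
```

Because the link, avatar, and name are read together inside the same anchor, the three values cannot drift out of step the way three independently collected lists can. The second, class-less `<img>` inside `flink-item-info` is skipped.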