luohaha / CSpider

A scalable and convenient crawler framework in C:).
https://github.com/luohaha/CSpider
MIT License

Page manipulations #20

Closed by mzer0-yu 8 years ago

mzer0-yu commented 8 years ago

A linked list is NOT a valuable data structure. I implemented something like a vector. Do ask me if you have any questions.

e.g.

unsigned int queue_id, page_id;
queue_id = new_page_queue(512); /* 512 is the capacity of page queue */
page_id = alloc_page_from_queue(queue_id);

cs_page* ptr_to_page = get_page_ptr(page_id);
set_page(ptr_to_page, PTR, LENGTH); /* PTR points to the content */

luohaha commented 8 years ago

I don't think page_queue is useful, because its capacity can't grow, and I can't decide up front what the capacity should be.
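
For illustration, one common way around a fixed capacity is to grow the backing array on demand. The sketch below is a hypothetical example, not the code from this issue: `page_t`, `page_vec`, and the `page_vec_*` functions are made-up names, and doubling from an initial capacity of 8 is an assumed growth policy.

```c
#include <stdlib.h>

/* Hypothetical types for illustration; not CSpider's actual definitions. */
typedef struct {
    char  *data;   /* page content (owned by the vector) */
    size_t length; /* number of bytes in data */
} page_t;

typedef struct {
    page_t *pages;    /* contiguous storage, as in a vector */
    size_t  count;    /* pages currently in use */
    size_t  capacity; /* pages allocated */
} page_vec;

/* Start empty; storage is allocated on the first page_vec_alloc call. */
static void page_vec_init(page_vec *v) {
    v->pages = NULL;
    v->count = 0;
    v->capacity = 0;
}

/* Append an empty page, doubling the capacity when the array is full.
 * Returns the new page's index, or (size_t)-1 if allocation fails. */
static size_t page_vec_alloc(page_vec *v) {
    if (v->count == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 8;
        page_t *grown = realloc(v->pages, new_cap * sizeof *grown);
        if (grown == NULL)
            return (size_t)-1; /* the old storage is still valid */
        v->pages = grown;
        v->capacity = new_cap;
    }
    v->pages[v->count].data = NULL;
    v->pages[v->count].length = 0;
    return v->count++;
}

/* Release every page's content and the array itself. */
static void page_vec_free(page_vec *v) {
    for (size_t i = 0; i < v->count; i++)
        free(v->pages[i].data);
    free(v->pages);
    page_vec_init(v);
}
```

Handing out indices rather than pointers matters here: `realloc` may move the array, so a pointer obtained before a growth step can dangle, while an index stays valid.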

luohaha commented 8 years ago

Also, when I fetch data with cURL, I can't get the whole string at once. Each time I receive one part, and then I have to splice the parts together.
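
A minimal sketch of that splicing, assuming the parts arrive through a function with the same parameter list as a libcurl write callback. `chunk_buf` and `append_chunk` are hypothetical names, not part of CSpider or libcurl, and the block deliberately does not include the curl headers so it can be compiled on its own.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator; not a CSpider or libcurl type. */
typedef struct {
    char  *data;   /* NUL-terminated bytes received so far */
    size_t length; /* bytes stored, excluding the terminator */
} chunk_buf;

/* Same parameter list as a libcurl write callback: ptr holds
 * size * nmemb bytes of the response. Returns the number of bytes
 * consumed; returning fewer than were passed in signals an error. */
static size_t append_chunk(char *ptr, size_t size, size_t nmemb, void *userdata) {
    chunk_buf *buf = userdata;
    size_t incoming = size * nmemb;

    /* Grow the buffer to fit the new part plus a terminator. */
    char *grown = realloc(buf->data, buf->length + incoming + 1);
    if (grown == NULL)
        return 0; /* buf->data is left untouched */

    memcpy(grown + buf->length, ptr, incoming);
    buf->data = grown;
    buf->length += incoming;
    buf->data[buf->length] = '\0';
    return incoming;
}
```

With libcurl, a function like this would be registered through `curl_easy_setopt` using `CURLOPT_WRITEFUNCTION`, with a pointer to a zero-initialized `chunk_buf` passed via `CURLOPT_WRITEDATA`. Once the transfer finishes, `data` holds the whole body and can be stored as the page content.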

mzer0-yu commented 8 years ago

I solved it.