Closed profsmucker closed 2 years ago
I checked this docno; there is no content for this one in the given dataset. When I click on the URL, I see the following error.
I used the "get-c4-collection-subset.py" script (already uploaded into this repo) to retrieve all the documents from c4.noclean. I also don't see this document's text in the passages csv and json-lines files I retrieved using the script, so I think the issue occurred during the processing of the pretty text. @AVakiliT could you help me take a look at this issue? Thank you
The web document contains broken/incorrect html which caused Jericho to fail at properly parsing it (returning an empty string). With cleanup/sanitization before parsing we get the below text.
Cleanup/sanitization would also fix the majority of the stray html tags in other documents. I could try creating a new pretty-printed collection where I sanitize before parsing; hopefully that would get rid of these issues that keep cropping up.
Childhood Fever | Baby Care Tips & Informations - Oh Baby Magazine Canada
* About Us
o Our Staff
o Legal Statement
o Privacy Policy
* Advertising
o Media Kit
o Ad Production
o View Magazine
* Membership
* Contact Us
RSS Feed
* Home
* Prenatal
* Newborn
* Toddler
* Oh Mommy
* Food
* Travel
* Blogs
* Cool Stuff
* Contests
* Galleries
Childhood Fever
Kimberly Dare, BSc, ND
Every year during flu season I am swamped with calls from concerned parents
regarding the treatment of a childhood fever. While fevers are upsetting to
parents and uncomfortable for a child, fevers are a natural response to
infection. It is necessary to remind parents that they should trust the
body to fight off infection on its own. I often have to remind my mothers
and fathers that a moderate fever is a good thing!
How high is too high? When do I give Tylenol?
A fever is an indication that the immune system is working. The ability to
generate a fever is a human response that essentially creates an
inhospitable environment for viruses/bacteria. In other words, a higher
body temperature will kill off and prevent the spread of viral and
bacterial infections.
In my opinion it is best not to suppress a fever with antipyretic
medication unless your child’s temperature gets up over 39°C (102°F).
Aspirin should not be used at all to treat childhood fevers. Use products
containing either ibuprofen or acetaminophen such as Tylenol®, Motrin®, and
Tempra®. Use medications that have been specifically formulated for infants
or children, and be sure to follow the dosage instructions.
If your child’s temperature is under 39°C and you choose to give them
medication, you are actually hindering the immunological response to
infection and possibly prolonging your child’s cold or flu. You are also
denying your child the opportunity to develop their immune system, which
down the road will help them fight off more serious infections.
Simple home interventions like tepid cloth compresses applied to the head,
wet sock treatments, dressing your child in light clothing, and lastly
giving your child a bath in lukewarm water are usually enough to prevent
temperatures from climbing too high. Only if the child is shivering should
a sponge bath, wet sock treatment or a tepid bath be avoided. A parent
should aim to maintain and monitor a healthy childhood fever between 38°C –
39°C.
When should I call the doctor?
If your child has a fever and is under 4 months of age, consult a doctor.
If your child is over 4 months of age and has a fever over 40°C, consult a
doctor.
It is important to note that it is very unlikely that your child will have
febrile seizures at temperatures less than 40°C. Only a small percentage of
children have a seizure caused by a fever. These usually occur in children
between 6 months and 6 years of age. It is not only the height of the
fever, but also how rapidly the temperature rises that puts a child at risk
for a seizure. Although these seizures are frightening for parents, they
usually are without serious long-term consequences. If you think your child
has had a seizure during a fever, you should call your doctor immediately.
Again, temperatures greater than 40°C or fevers that last more than 5-7
days warrant medication and a visit to your family physician.
How much fluid should I give to prevent dehydration?
Parents should ensure that their child is getting enough fluid. Having your
child continuously sip water, diluted Gatorade® or Pedialyte®, or soup
broth is essential to preventing dehydration during illness.
If you are particularly worried that your child is not getting enough fluid
you can use a 10ml syringe filled with water/pedialyte every 15 minutes to
treat and prevent dehydration. A good guideline is 50ml-200ml of
fluid/kg/24hr. If your child will not sip fluids of any kind and is showing
signs of dehydration, consult a physician. Signs of dehydration are: dry
lips, dry tongue, sunken eye sockets, concave fontanelles, skin tenting,
profound listlessness, profound weakness, and general unresponsiveness.
Your child’s temperature is not a good indicator of the severity of the
illness or infection. In fact, a parent should use their child’s
disposition and behavior to best judge the severity of the illness and the
need to consult a medical doctor or hospital.
Mild lethargy, weakness, a lack of appetite and a lack of thirst are all
NORMAL. Allow your child the time to recover and rest. You will know they
are on the mend when they are able to get up and play and/or their appetite
returns. Although it is difficult to stand by and watch your child be sick
with a cold or flu – it is the best thing to do. Fever medications should
be used in the latter stages of a fever, if at all.
Kim Dare is a Naturopathic Doctor with a family based private practice in
Hamilton, Onario. Her articles on Family and Pediactric health can be found
on www.babynaturopathics.com a website that promotes organic children’s
clothing and non-toxic toys.
Subscribe to the feed via Email
« Sun Safety Coffee, Tea &Baby: Is Caffeine Safe During Pregnancy? »
Become an
Member
for exclusive contests, articles and promotions!
Avatars by Sterling Adventures
* The Magazine
* Prenatal
* Newborn
* Toddler
* Oh Mommy
* * Food
* Travel
* Blogs
* Directory
* About Us
* Our Staff
* Contact Us
* Legal Statement
* Privacy Policy
* Advertising
* Media Kit
* Ad Production
* View Magazine
* News and Events
* Subscriptions
* Contests
* Deals & Coupons
* Site Map
* Links
© 2019 Baby Magazines Canada, Baby Care Tips, Parents Information
Magazines, Parents Articles Oh baby. All Rights Reserved. Internet
Marketing by TechWyse
@AVakiliT Will you do the cleanup/sanitization for the whole collection? If yes, could you let me know when you're done so that I can retrieve a new subset for the preference judgement system? Thank you
A somewhat bigger concern is the number of such failures that were not caught. There should be a sanity check on the size of the resulting document, or a post-processing check against the length of the WET files.
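A rough sketch of what such a post-processing check could look like, in Python. It assumes the rendered (pretty) text and the reference text (for example, the WET/c4.noclean text) are already loaded into dicts keyed by docno. The function name and the `min_ratio` threshold are made up for illustration and would need tuning against real data.

```python
def find_suspect_docs(rendered, reference, min_ratio=0.2):
    """Return docnos whose rendered text is missing or far shorter than the reference.

    rendered and reference are hypothetical dicts mapping docno -> text.
    min_ratio is an assumed threshold, not a value taken from the pipeline.
    """
    suspects = []
    for docno, ref_text in reference.items():
        ref_len = len((ref_text or "").strip())
        if ref_len == 0:
            # Nothing to compare against, so this check cannot flag the document.
            continue
        out_len = len((rendered.get(docno) or "").strip())
        if out_len < min_ratio * ref_len:
            suspects.append(docno)
    return suspects


rendered = {"a": "", "b": "x" * 100, "c": "x" * 10}
reference = {"a": "y" * 200, "b": "y" * 120, "c": "y" * 200, "d": ""}
suspects = find_suspect_docs(rendered, reference)
print(suspects)
```

Here "a" (empty render) and "c" (render much shorter than the reference) are flagged, while "d" is skipped because the reference itself has no text.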
Ian at NIST has downloaded c4.pretty for use at NIST. If there are many such failures, we need to notify him, or get him a fresh version as soon as possible. (Same goes for our usage.)
@AVakiliT, what is the extent of this issue, and what would it take to create a cleaner collection? At the same time, do we have a way to have Jericho produce simple html rather than formatted text? Basically, it seems that all the plaintext contains is paragraphs and lists, and maybe some approximation of bolding.
For example, what happens with pages when we run them with the method stripInvalidMarkup from http://jericho.htmlparser.net/samples/console/src/HTMLSanitiser.java but we only keep paragraph tags, lists, etc.? Or do we really need some hook into the renderer?
If we don't have a nice way to make simple html, then perhaps we should sanitize and then render, setting the max line length for the renderer to something like 60 rather than 76 (the default), to help fit the text on the page?
I'm searching for code to turn an existing html page into something that is accessible (easy to read for blind people) or can be displayed on all devices (like how Chrome on mobile will offer to simplify a doc), but I'm not finding any.
A new collection where we run sanitization before parsing would fix most of the issues. I ran the jobs over the weekend but a few of them failed and I am rerunning them.
By simple html do you mean something like markdown? A decent markdown renderer should be able to display things nicely on a variety of devices.
Markdown would be ideal
@MahsaSeifikar what formats will your highlighting, passage selection, etc. work with?
@claclark @MahsaSeifikar Mahsa's highlighting code cannot work across html tag boundaries, and thus it would be bad for us to supply her with html unless her highlighting was smart enough to work across html tags, which I think is likely to be very hard.
How would markdown work?
I have a sample of pretty files for the 2021 qrels with the new requirements.
One notable issue is that websites that are single page applications (using react etc.) have no actual text to render on some urls. Everything is in script tags or being requested on the fly. Which means there is nothing for Jericho to display.
Maybe we could fall back on c4 text?
For react (and other such sites), what does c4 have for them? Is c4 using some sort of alt text?
Where can I see the new pretty files?
c4 used WET files and after checking them, c4 doesn't have extracted text for these pages either. Only the title.
Do you know where they are stored? Or is there a running judgo interface?
If c4.noclean lacks text for a document, then we don't have to do any better. Given that they have a title, we should also have a title.
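The fallback order discussed above could be sketched roughly like this in Python. The function and argument names are hypothetical; the idea is only to prefer the pretty text, fall back on the c4 text when rendering produced nothing, and at minimum keep the title.

```python
def choose_content(pretty_text, c4_text, title):
    """Pick the first non-empty candidate: pretty text, then c4 text, then title.

    All three arguments are assumed to be strings or None.
    """
    for candidate in (pretty_text, c4_text, title):
        if candidate and candidate.strip():
            return candidate.strip()
    return ""


# A single page application with no renderable text still gets its title.
spa_content = choose_content("", None, "Example Title")
print(spa_content)
```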
Yes, where should I look to see the samples? I don't need to see them in judgo yet.
All seems good now.
This docno: noclean.c4-train.00974-of-07168.65480
has "nan" as content, but it does have a title and url.
Where did processing of this docno fail?
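This does not answer where the processing failed, but here is a small Python sketch for finding other documents with the same symptom in a json-lines export. The field names `docno` and `content` are assumptions about the file layout. One possible (unconfirmed) cause is a missing value being converted to text somewhere along the way, since `str(float("nan"))` is `"nan"` in Python.

```python
import json
import math


def is_missing(value):
    """True for None, a float NaN, an empty string, or the literal text 'nan'."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return str(value).strip().lower() in ("", "nan")


def find_missing_content(lines, field="content"):
    """Return the docnos of json-lines records whose content looks missing."""
    bad = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if is_missing(record.get(field)):
            bad.append(record.get("docno"))
    return bad


sample = [
    '{"docno": "d1", "content": "nan"}',
    '{"docno": "d2", "content": "some text"}',
    '{"docno": "d3", "content": NaN}',
    '{"docno": "d4"}',
]
missing = find_missing_content(sample)
print(missing)
```

On a real file, something like `find_missing_content(open(path, encoding="utf-8"))` would give the list of docnos to re-check.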