Open forrestdavis opened 1 month ago
The current handling definitely isn't robust (or something else is wrong). Running evaluate with a BERT model does not work when MAX_LENGTH is set to the model maximum (512) minus 2, i.e. 510: the second chunk ends up with 514 tokens. Consider the following text:
Subspecies is set in Romania where two American college students Michele (Laura Mae Tate) & Lillian (Michelle McBride) arrive to study local folklore with the aid of local friend Mara (Irina Movila). There they rent rooms in a hotel & become curious about the mysterious ruins of a nearby castle, it turns out that a powerful & evil Vampire named Radu (Anders Hove) lives there who has stolen the Bloodstone from his father King Vladislav (Angus Scrimm). Radu takes a fancy to the three girls & starts drinking the blood of Mara & Lillian, meanwhile Michele falls for a guy named Stefan (Michael Watson) who just so happens to be Radu's brother. Michele & Stefan decide to team up & rid the world of the evil Radu...<br /><br />Directed by Ted Nicolaou this film seems to be quite highly regarded amongst genre fans & while it's not terrible I certainly wouldn't call it very good & I could't really see anything much to get excited about. Subspecies is a rather slow going film, not that much actually happens & while it does try to stay close to certain classic Vampire lore there's all this nonsense about a Bloodstone & some little monsters that grow from the tips of Radu's severed fingers for some reason. Subspecies could have been a half decent film if not for the fact that it's dull, I really can't remember that much about it, good or bad. The character's are alright but some f the dialogue is silly & there's a scene which bugged me near the start when the girls are at the castle ruins & one says they have to go because it's getting dark yet it's still clearly the middle of the day & very bright. There's also a scene where one of the American girls finds a coffin that hotel's attic & doesn't really seem that bothered by it, I am not being funny but is some bloke whose house I was staying at had a coffin in his attic I would be very, very worried if you know what I mean. 
I don't think I would ever want to watch it again, there's no real threat, the plot is weak that mixes classic Vampire themes with silly subplots & I was distinctly unmoved by it all. Not the worst film ever but hardly the best either.<br /><br />The film looks alright with nice locations & some local scenery although you feel the look is down to the budget rather than the makers attempt a authenticity. There's not much gore apart from a decapitation & some broken off finger tips. For no apparent reason the makers throw in some average looking stop-motion animated monsters that really don't do anything or have much significance to the story.<br /><br />Filmed on the cheap by Charles Band's Full Moon Entertainment production company in Bucharest in Romania, the production values are alright & better than many later day Band productions. The acting isn't great with many of the cast putting in below par performances while genre regular Angus Scrimm has a small cameo at the start. There's a little bit of style here on occasion with a few scene reminding heavily of the original Nosferatu (1922) in particular the bit showing Radu's shadow coming down the stair with his long claw like fingernails standing out.<br /><br />Subspecies is a film that many seem to like for reasons I don't quite see, I thought it was throughly average at best & overall rather dull. Followed by Bloodstone: Subspecies II (1993), Bloodlust: Subspecies III (1994), Subspecies 4: Bloodstorm (1998) & the spin0off film Vampire Journals (1997).
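A minimal sketch of the invariant this report points at, assuming non-overlapping chunks and special tokens that are simply prepended and appended: the content budget per chunk is the model maximum minus the number of special tokens, so no chunk can exceed the maximum once they are added. The function name `chunk_with_special_tokens` and its parameters are hypothetical and are not taken from the repository. The ids 101 and 102 are the `[CLS]` and `[SEP]` ids in the `bert-base-uncased` vocabulary, used here only for illustration.

```python
from typing import List, Optional


def chunk_with_special_tokens(
    token_ids: List[int],
    model_max_length: int,
    prefix_ids: Optional[List[int]] = None,
    suffix_ids: Optional[List[int]] = None,
) -> List[List[int]]:
    """Split token_ids so each chunk fits model_max_length after special tokens are added.

    Hypothetical helper: not the repository's implementation.
    """
    prefix = list(prefix_ids or [])
    suffix = list(suffix_ids or [])
    # Reserve room for the special tokens before slicing the content.
    budget = model_max_length - len(prefix) - len(suffix)
    if budget <= 0:
        raise ValueError("model_max_length leaves no room for content tokens")
    chunks = []
    for start in range(0, len(token_ids), budget):
        chunks.append(prefix + token_ids[start:start + budget] + suffix)
    return chunks


# Illustrative ids: 101 = [CLS], 102 = [SEP] in bert-base-uncased.
CLS_ID, SEP_ID = 101, 102
ids = list(range(1000, 1700))  # 700 stand-in content token ids
chunks = chunk_with_special_tokens(ids, 512, [CLS_ID], [SEP_ID])
print([len(c) for c in chunks])  # [512, 192]
```

With a 512-token model and two special tokens, the content budget is 510, which matches the MAX_LENGTH used in the report. A causal model that adds no special tokens would pass empty prefix and suffix lists and get the full budget.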
Issue
Validate maximum context length is handled properly
Motivation
At the moment, for contexts larger than the maximum allowed for a fixed-length model, the code only partially addresses required special tokens like `[CLS]` and `[SEP]`. It would be good to systematically look for a variety of both masked and causal models which require things like beginning-of-sentence and end-of-sentence tokens and ensure that the code works properly for them.
Your contribution
Demonstrate that the current approach works for a variety of models. You should look at the `by_token_predictability` functions in `src/models/hf_causal_model.py` and `src/models/hf_masked_model.py`.
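One way to make such a demonstration repeatable is a small checker that takes the chunks produced for a long input and reports any violation. The sketch below is an assumption about what "handled properly" could mean, not the project's test suite. It checks that no chunk exceeds the model maximum, that required beginning and end tokens are present, and that the content tokens reproduce the original sequence (so it assumes non-overlapping chunks). The name `check_chunks` and its parameters are hypothetical.

```python
from typing import List, Optional


def check_chunks(
    chunks: List[List[int]],
    original_ids: List[int],
    model_max_length: int,
    bos_id: Optional[int] = None,
    eos_id: Optional[int] = None,
) -> List[str]:
    """Return a list of problems found; an empty list means the chunks pass.

    Hypothetical helper: assumes non-overlapping chunks.
    """
    problems = []
    recovered = []
    for i, chunk in enumerate(chunks):
        if len(chunk) > model_max_length:
            problems.append(
                f"chunk {i} has {len(chunk)} tokens, max is {model_max_length}"
            )
        content = list(chunk)
        # Strip the expected special tokens, reporting any that are missing.
        if bos_id is not None:
            if content and content[0] == bos_id:
                content = content[1:]
            else:
                problems.append(f"chunk {i} does not start with id {bos_id}")
        if eos_id is not None:
            if content and content[-1] == eos_id:
                content = content[:-1]
            else:
                problems.append(f"chunk {i} does not end with id {eos_id}")
        recovered.extend(content)
    if recovered != original_ids:
        problems.append("content tokens do not reproduce the original sequence")
    return problems


# Stand-in data shaped like the report: a second chunk of 512 content tokens
# plus two special tokens (101 and 102 are illustrative ids).
ids = list(range(1000, 2022))  # 1022 content token ids
bad = [[101] + ids[:510] + [102], [101] + ids[510:] + [102]]
print(check_chunks(bad, ids, 512, bos_id=101, eos_id=102))
# ['chunk 1 has 514 tokens, max is 512']
```

For a real run, the chunks would come from the functions named above and the maximum length and special token ids from each model's tokenizer. Causal models that add no special tokens can leave `bos_id` and `eos_id` as `None`.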